PREFACE

During recent years there has been an ever increasing interest in modern 
algebra not only of students in mathematics but also of those in physics, 
chemistry, psychology, economics, and statistics. My Modern Higher Alge- 
bra was intended, of course, to serve primarily the first of these groups, and 
its rather widespread use has assured me of the propriety of both its con- 
tents and its abstract mode of presentation. This assurance has been con- 
firmed by its successful use as a text, the sole prerequisite being the subject 
matter of L. E. Dickson's First Course in the Theory of Equations. However, 
I am fully aware of the serious gap in mode of thought between the intuitive 
treatment of algebraic theory of the First Course and the rigorous abstract 
treatment of the Modern Higher Algebra, as well as the pedagogical difficulty 
which is a consequence. 

The publication recently of more abstract presentations of the theory of 
equations gives evidence of attempts to diminish this gap. Another such at- 
tempt has resulted in a supposedly less abstract treatise on modern algebra 
which is about to appear as these pages are being written. However, I have 
the feeling that neither of these compromises is desirable and that it would 
be far better to make the transition from the intuitive to the abstract by the 
addition of a new course in algebra to the undergraduate curriculum in 
mathematics, a curriculum which contains at most two courses in algebra
and these only partly algebraic in content. 

This book is a text for such a course. In fact, its only prerequisite ma- 
terial is a knowledge of that part of the theory of equations given as a chap- 
ter of the ordinary text in college algebra as well as a reasonably complete 
knowledge of the theory of determinants. Thus, it would actually be pos- 
sible for a student with adequate mathematical maturity, whose only train- 
ing in algebra is a course in college algebra, to grasp the contents. I have used 
the text in manuscript form in a class composed of third- and fourth-year 
undergraduate and beginning graduate students, and they all seemed to find 
the material easy to understand. I trust that it will find such use elsewhere 
and that it will serve also to satisfy the great interest in the theory of matrices 
which has been shown me repeatedly by students of the social sciences. 

I wish to express my deep appreciation of the fine critical assistance of 
Dr. Sam Perlis during the course of publication of this book.

UNIVERSITY OF CHICAGO A. A. ALBERT 

September 9, 1940 

CHAPTER I
POLYNOMIALS 

1. Polynomials in x. There are certain simple algebraic concepts with 
which the reader is probably well acquainted but not perhaps in the termi- 
nology and form desirable for the study of algebraic theories. We shall thus 
begin our exposition with a discussion of these concepts. 

We shall speak of the familiar operations of addition, subtraction, and 
multiplication as the integral operations. A positive integral power is then 
best regarded as the result of a finite repetition of the operation of multi- 
plication. 

A polynomial f(x) in x is any expression obtained as the result of the
application of a finite number of integral operations to x and constants. If
g(x) is a second such expression and it is possible to carry out the operations
indicated in the given formal expressions for f(x) and g(x) so as to obtain
two identical expressions, then we shall regard f(x) and g(x) as being the
same polynomial. This concept is frequently indicated by saying that f(x)
and g(x) are identically equal and by writing f(x) ≡ g(x). However, we
shall usually say merely that f(x) and g(x) are equal polynomials and write
f(x) = g(x). We shall designate by 0 the polynomial which is the constant
zero and shall call this polynomial the zero polynomial. Thus, in a
discussion of polynomials f(x) = 0 will mean that f(x) is the zero polynomial.
No confusion will arise from this usage for it will always be clear from the
context that, in the consideration of a conditional equation f(x) = 0 where
we seek a constant solution c such that f(c) = 0, the polynomial f(x) is not
the zero polynomial. We observe that the zero polynomial has the
properties

0 · g(x) = 0 ,     0 + g(x) = g(x)

for every polynomial g(x). 

Our definition of a polynomial includes the use of the familiar term con- 
stant. By this term we shall mean any complex number or function inde- 
pendent of x. Later on in our algebraic study we shall be much more ex- 
plicit about the meaning of this term. For the present, however, we shall 
merely make the unprecise assumption that our constants have the usual 
properties postulated in elementary algebra. In particular, we shall assume 
the properties that if a and b are constants such that ab = 0 then either




2 INTRODUCTION TO ALGEBRAIC THEORIES 

a or b is zero; and if a is a nonzero constant then a has a constant inverse
a^{-1} such that a a^{-1} = 1.

If f(x) is the label we assign to a particular formal expression of a
polynomial and we replace x wherever it occurs in f(x) by a constant c, we
obtain a corresponding expression in c which is the constant we designate by
f(c). Suppose now that g(x) is any different formal expression of a
polynomial in x and that f(x) = g(x) in the sense defined above. Then it is
evident that f(c) = g(c). Thus, in particular, if h(x), q(x), r(x) are
polynomials in x such that f(x) = h(x)q(x) + r(x) then f(c) = h(c)q(c) + r(c)
for any c. For example, we have f(x) = x^3 - 2x^2 + 3x, h(x) = x - 1,
q(x) = x^2 - x, r(x) = 2x, and are stating that for any c we have
c^3 - 2c^2 + 3c = (c - 1)(c^2 - c) + 2c.
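
The substitution remark can be checked numerically. The following is a minimal sketch, assuming a polynomial is stored as a Python list of coefficients with the highest power first; the name `evaluate` is illustrative and does not come from the text.

```python
def evaluate(coeffs, c):
    """Evaluate the polynomial with coefficient list coeffs (highest power first) at c."""
    value = 0
    for a in coeffs:
        value = value * c + a   # Horner's rule: one multiplication per coefficient
    return value

f = [1, -2, 3, 0]   # x^3 - 2x^2 + 3x
h = [1, -1]         # x - 1
q = [1, -1, 0]      # x^2 - x
r = [2, 0]          # 2x

# f(x) = h(x)q(x) + r(x) as polynomials, so the same relation holds for every constant c.
for c in range(-5, 6):
    assert evaluate(f, c) == evaluate(h, c) * evaluate(q, c) + evaluate(r, c)
```

The loop only samples a few integers; the identity itself holds for every constant c because the two formal expressions are the same polynomial.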

If the indicated integral operations in any given expression of a
polynomial f(x) be carried out, we may express f(x) as a sum of a finite
number of terms of the form a x^k. Here k is a non-negative integer and a is a
constant called the coefficient of x^k. The terms with the same exponent k
may be combined into a single term whose coefficient is the sum of all their
coefficients, and we may then write

(1) f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_{n-1} x + a_n .

The constants a_i are called the coefficients of f(x) and may be zero, but
unless f(x) is the zero polynomial, we may always take a_0 ≠ 0. The
expression (1) of f(x) with a_0 ≠ 0 is most important since, if g(x) is a second
polynomial and we write g(x) in the corresponding form

(2) g(x) = b_0 x^m + b_1 x^{m-1} + ... + b_m

with b_0 ≠ 0, then f(x) and g(x) are equal if and only if m = n, a_i = b_i for
i = 0, ..., n. In other words, we may say that the expression (1) of a
polynomial is unique, that is, two polynomials are equal if and only if their
expressions (1) are identical.

The integer n of any expression (1) of f(x) is called the virtual degree of
the expression (1). If a_0 ≠ 0 we call n the degree* of f(x). Thus, either any
f(x) has a positive integral degree, or f(x) = a_n is a constant and will be
called a constant polynomial in x. If, then, a_n ≠ 0 we say that the
constant polynomial f(x) has degree zero. But if a_n = 0, so that f(x) is the
zero polynomial, we shall assign to it the degree minus infinity. This will
be done so as to imply that certain simple theorems on polynomials shall
hold without exception.

* Clearly any polynomial of degree n_0 may be written as an expression of the form (1)
of virtual degree any integer n ≥ n_0. We may thus speak of any such n as a virtual
degree of f(x).

The coefficient a_0 in (1) will be called the virtual leading coefficient of this
expression of f(x) and will be called the leading coefficient of f(x) if and only
if it is not zero. We shall call f(x) a monic polynomial if a_0 = 1. We then
have the elementary results referred to above, whose almost trivial
verification we leave to the reader.

LEMMA 1. The degree of a product of two polynomials f(x) and g(x) is the
sum of the degrees of f(x) and g(x). The leading coefficient of f(x)g(x) is
the product of the leading coefficients of f(x) and g(x), and thus, if f(x) and
g(x) are monic, so is f(x)g(x).

LEMMA 2. A product of two nonzero polynomials is nonzero and is a
constant if and only if both factors are constants.

LEMMA 3. Let f(x) be nonzero and such that f(x)g(x) = f(x)h(x). Then
g(x) = h(x).

LEMMA 4. The degree of f(x) + g(x) is at most the larger of the two degrees
of f(x) and g(x).
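
The degree conventions behind Lemmas 1 and 4 can be tried out on small cases. This is only a sketch under stated assumptions: polynomials are Python lists of coefficients with the highest power first, and the helper names `trim`, `degree`, `multiply`, and `add` are illustrative, not taken from the text.

```python
import math

def trim(p):
    """Drop zero leading coefficients; the zero polynomial becomes the empty list."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def degree(p):
    """Degree of a coefficient list; the zero polynomial is assigned minus infinity."""
    p = trim(p)
    return len(p) - 1 if p else -math.inf

def multiply(p, q):
    """Product of two coefficient lists (highest power first)."""
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    """Sum of two coefficient lists, padded on the left to a common virtual degree."""
    n = max(len(p), len(q))
    p = [0] * (n - len(p)) + p
    q = [0] * (n - len(q)) + q
    return [a + b for a, b in zip(p, q)]

f = [2, 0, 1]                 # 2x^2 + 1
g = [3, -1]                   # 3x - 1
product = multiply(f, g)      # 6x^3 - 2x^2 + 3x - 1

assert degree(product) == degree(f) + degree(g)     # Lemma 1, degrees add
assert product[0] == f[0] * g[0]                    # Lemma 1, leading coefficients multiply
assert degree(add([1, 2], [-1, 5])) == 0            # Lemma 4, the degree of a sum may drop
assert degree(multiply(f, [0])) == degree(f) + degree([0])   # minus infinity keeps Lemma 1 valid
```

The last assertion illustrates why the zero polynomial is given the degree minus infinity: with that convention the degree rule for products needs no exception.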

EXERCISES*

* The early exercises in our sets should normally be taken up orally. The author's
choice of oral exercises will be indicated by the language employed.

1. State the condition that the degree of f(x) + g(x) be less than the degree of
either f(x) or g(x).

2. What can one say about the degree of f(x) + g(x) if f(x) and g(x) have
positive leading coefficients?

3. What can one say about the degree of f^2, of f^3, of f^k for f = f(x) a polynomial,
k a positive integer?

4. State a result about the degree and leading coefficient of any polynomial
s(x) = f_1^2 + ... + f_t^2 for t > 1, f_i = f_i(x) a polynomial in x with real coefficients.

5. Make a corresponding statement about g(x)s(x) where g(x) has odd degree
and real coefficients, s(x) as in Ex. 4.

6. State the relation between the term of least degree in f(x)g(x) and those of
least degree in f(x) and g(x).

7. State why it is true that if x is not a factor of f(x) or g(x) then x is not a
factor of f(x)g(x).

8. Use Ex. 7 to prove that if k is a positive integer then x is a factor of [f(x)]^k if
and only if x is a factor of f(x).

9. Let f and g be polynomials in x such that the following equations are satisfied
(identically). Show, then, that both f and g are zero. Hint: Verify first that
otherwise both f and g are not zero. Express each equation in the form a(x) = b(x) and
apply Ex. 3. In parts (c) and (d) complete the squares.

a) f^2 + x g^2 = 0          c) f^4 + 2x f^2 g^2 + (x^2 - x) g^4 = 0

b) f^2 - x g^2 = 0          d) f^2 + 2x f g - x g^2 = 0



10. Use Ex. 8 to give another proof of (a), (b), and (d) of Ex. 9. Hint: Show that
if f and g are nonzero polynomial solutions of these equations of least possible
degrees, then x divides f = x f_1 as well as g = x g_1. But then f_1 and g_1 are also
solutions, a contradiction.

11. Use Ex. 4 to show that if f, g, and h are polynomials in x with real coefficients
satisfying the following equations (identically), then they are all zero:

a) f^2 - x g^2 = x h^2

b) f^2 - x g^2 + h^2 = 0

c) f^2 + g^2 + (x + 2) h^2 = 0

12. Find solutions of the equations of Ex. 11 for polynomials f, g, h with complex
coefficients and not all zero.

2. The division algorithm. The result of the application of the process 
ordinarily called long division to polynomials is a theorem which we shall 
call the Division Algorithm for polynomials and shall state as 

Theorem 1. Let f(x) and g(x) be polynomials of respective degrees n and m,
g(x) ≠ 0. Then there exist unique polynomials q(x) and r(x) such that r(x)
has virtual degree m - 1, q(x) is either zero or has degree n - m, and

(3) f(x) = q(x)g(x) + r(x) .

For let f(x) and g(x) be defined respectively by (1) and (2) with b_0 ≠ 0.
Then, either n < m and we have (3) with q(x) = 0, r(x) = f(x), or a_0 ≠ 0,
n ≥ m. If c_k is the virtual leading coefficient of a polynomial h(x) of virtual
degree m + k ≥ m, a virtual degree of h(x) - b_0^{-1} c_k x^k g(x) is m + k - 1.
Thus a virtual degree of f(x) - b_0^{-1} a_0 x^{n-m} g(x) is n - 1, and a finite repetition
of this process yields a polynomial r(x) = f(x) - b_0^{-1}(a_0 x^{n-m} + ...)g(x) of
virtual degree m - 1, and hence (3) for q(x) of degree n - m and leading
coefficient a_0 b_0^{-1} ≠ 0. If also f(x) = q_0(x)g(x) + r_0(x) for r_0(x) of virtual
degree m - 1, then a virtual degree of s(x) = r_0(x) - r(x) is m - 1. But
Lemma 1 states that if t(x) = q(x) - q_0(x) ≠ 0 the degree of s(x) =
t(x)g(x) is the sum of m and the degree of t(x). This is impossible; and
t(x) = 0, q(x) = q_0(x), r(x) = r_0(x).
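
The repeated subtraction used in the proof is exactly the long-division loop. The sketch below assumes coefficient lists with the highest power first and exact rational arithmetic from Python's `fractions` module; the function name `divide` is an illustrative choice, not the author's.

```python
from fractions import Fraction

def divide(f, g):
    """Return (q, r) with f = q*g + r and r of virtual degree deg(g) - 1.

    f and g are coefficient lists, highest power first, and g[0] must be nonzero.
    """
    if not g or g[0] == 0:
        raise ValueError("g must have a nonzero leading coefficient")
    r = [Fraction(a) for a in f]
    m = len(g) - 1
    q = []
    # Each pass cancels the virtual leading term of r by subtracting
    # (c_k / b_0) * x^k * g(x), which lowers the virtual degree by one.
    while len(r) - 1 >= m:
        c = r[0] / g[0]
        q.append(c)
        for i, b in enumerate(g):
            r[i] -= c * b
        r.pop(0)    # the leading coefficient is now zero and is discarded
    return q, r

# f(x) = x^3 - 2x^2 + 3x divided by g(x) = x - 1
q, r = divide([1, -2, 3, 0], [1, -1])
assert q == [1, -1, 2]    # q(x) = x^2 - x + 2
assert r == [2]           # r(x) = 2
```

When n < m the loop never runs, so q is the empty list (the zero polynomial) and r is f itself, matching the first case of the proof.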

The Remainder Theorem of Algebra states that if we use the Division
Algorithm to write

f(x) = q(x)(x - c) + r(x) ,

so that g(x) = x - c has degree one and r = r(x) is necessarily a constant,
then r = f(c). The obvious proof of this result is the use of the remark in
the fifth paragraph of Section 1 to obtain f(c) = q(c)(c - c) + r, f(c) = r
as desired. It is for this application that we made the remark.
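
For the divisor x - c the division collapses to the familiar synthetic-division scheme, and the last number produced is both the remainder and the value f(c). A minimal sketch, assuming a nonempty coefficient list with the highest power first; the name `synthetic_division` is illustrative.

```python
def synthetic_division(f, c):
    """Divide f (coefficients, highest power first) by x - c.

    Returns (quotient coefficients, remainder); the remainder equals f(c).
    """
    partial = []
    value = 0
    for a in f:
        value = value * c + a
        partial.append(value)
    return partial[:-1], partial[-1]

# f(x) = x^3 - 2x^2 + 3x and c = 1: quotient x^2 - x + 2, remainder f(1) = 2
quotient, remainder = synthetic_division([1, -2, 3, 0], 1)
assert quotient == [1, -1, 2]
assert remainder == 1 - 2 + 3
```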

The Division Algorithm and Remainder Theorem imply the Factor Theorem,
a result obtained and used frequently in the study of polynomial
equations. We shall leave the statements of that theorem, and the subsequent
definitions and theorems on the roots and corresponding factorizations of
polynomials* with real or complex coefficients, to the reader.

* If f(x) is a polynomial in x and c is a constant such that f(c) = 0 then we shall
call c a root not only of the equation f(x) = 0 but also of the polynomial f(x).

EXERCISES 

1. Show by formal differentiation that if c is a root of multiplicity m of f(x) =
(x - c)^m q(x) then c is a root of multiplicity m - 1 of the derivative f'(x) of f(x).
What then is a necessary and sufficient condition that f(x) have multiple roots?

2. Let c be a root of a polynomial f(x) of degree n and ordinary integral
coefficients. Use the Division Algorithm to show that any polynomial h(c) with rational
coefficients may be expressed in the form b_0 + b_1 c + ... + b_{n-1} c^{n-1} for rational
numbers b_0, ..., b_{n-1}. Hint: Write h(x) = q(x)f(x) + r(x) and replace x by c.

3. Let f(x) = x^3 + 3x^2 + 4 in Ex. 2. Compute the corresponding b_i for each of
the polynomials

a) c^6 + 10c^4 + 25c^2              c) c^6 - 2c^4 + c^2

b) c^4 + 4c^3 + 6c^2 + 4c + 1       d) (2c^2 + 3)(c^3 + 3c)

3. Polynomial divisibility. Let f(x) and g(x) ≠ 0 be polynomials. Then
by the statement that g(x) divides f(x) we mean that there exists a
polynomial q(x) such that f(x) = q(x)g(x). Thus, g(x) ≠ 0 divides f(x) if and
only if the polynomial r(x) of (3) is the zero polynomial, and we shall say
in this case that f(x) has g(x) as a factor, g(x) is a factor of f(x).

We shall call two nonzero polynomials f(x) and g(x) associated polynomials
if f(x) divides g(x) and g(x) divides f(x). Then f(x) = q(x)g(x), g(x) =
h(x)f(x), so that f(x) = q(x)h(x)f(x). Applying Lemmas 3 and 2, we have
q(x)h(x) = 1, q(x) and h(x) are nonzero constants. Thus f(x) and g(x) are
associated if and only if each is a nonzero constant multiple of the other.

It is clear that every nonzero polynomial is associated with a monic
polynomial. Observe thus that the familiar process of dividing out the leading
coefficient in a conditional equation f(x) = 0 is that used to replace this
equation by the equation g(x) = 0 where g(x) is the monic polynomial
associated with f(x).
Two associated monic polynomials are equal. We see from this that if
g(x) divides f(x) every polynomial associated with g(x) divides f(x) and
that one possible way to distinguish a member of the set of all associates
of g(x) is to assume the associate to be monic. We shall use this property
later when we discuss the existence of a unique greatest common divisor
(abbreviated, g.c.d.) of polynomials in x.

In our discussion of the g.c.d. of polynomials we shall obtain a property 
which may best be described in terms of the concept of rational function. 
It will thus be desirable to arrange our exposition so as to precede the study 
of greatest common divisors by a discussion of the elements of the theory of 
polynomials and rational functions of several variables, and we shall do so. 

EXERCISES 

1. Let f = f(x) be a polynomial in x and define m(f) = x^m f(1/x) for every
positive integer m. Show that m(f) is a polynomial in x of virtual degree m if and only
if m is a virtual degree of f(x).

2. Show that m(0) = 0, m[m(f)] = f.

3. Define \hat{f} = 0 if f = 0, and \hat{f} = n(f) if f is any nonzero polynomial of degree n.
Show that m(f) = x^{m-n} \hat{f} for every m ≥ n and that, if f ≠ 0, x is not a factor of \hat{f}.

4. Let g be a factor of f. Prove that \hat{g} is a factor of m(f) for every m which is at
least the degree of f.

4. Polynomials in several variables. Some of our results on polynomials
in x may be extended easily to polynomials in several variables. We define
a polynomial f = f(x_1, ..., x_q) in x_1, ..., x_q to be any expression obtained
as the result of a finite number of integral operations on x_1, ..., x_q and
constants. As in Section 1 we may express f(x_1, ..., x_q) as the sum of a
finite number of terms of the form

(4) a x_1^{k_1} x_2^{k_2} ... x_q^{k_q} .

We call a the coefficient of the term (4) and define the virtual degree in
x_1, ..., x_q of such a term to be k_1 + ... + k_q, the virtual degree of a
particular expression of f as a sum of terms of the form (4) to be the largest of
the virtual degrees of its terms (4). If two terms of f have the same set of
exponents k_1, ..., k_q, we may combine them by adding their coefficients
and thus write f as the unique sum, that is, the sum with unique coefficients,

(5) f = f(x_1, ..., x_q) = Σ a_{k_1 ... k_q} x_1^{k_1} x_2^{k_2} ... x_q^{k_q}     (k_j = 0, 1, ..., n_j ; j = 1, ..., q) .

Here the coefficients a_{k_1 ... k_q} are constants and n_j is the degree of
f(x_1, ..., x_q) considered as a polynomial in x_j alone. Also f is the zero
polynomial if and only if all its coefficients are zero. If f is a nonzero
polynomial, then some a_{k_1 ... k_q} ≠ 0, and the degree of f is defined to be the
maximum sum k_1 + ... + k_q for a_{k_1 ... k_q} ≠ 0. As before we assign the
degree minus infinity to the zero polynomial and have the property that
nonzero constant polynomials have degree zero. Note now that a
polynomial may have several different terms of the same degree and that
consequently the usual definitions of leading term and coefficient do not apply.
However, some of the most important simple properties of polynomials in
x hold also for polynomials in several x_j, and we shall proceed to their
derivation.

We observe that a polynomial f in x_1, ..., x_q may be regarded as a
polynomial (1) of degree n = n_q in x = x_q with its coefficients a_0, ..., a_n all
polynomials in x_1, ..., x_{q-1} and a_0 not zero. If, similarly, g be given by
(2) with b_0 not zero, then a virtual degree in x_q of fg is m + n, and a virtual
leading coefficient of fg is a_0 b_0. If q = 2, then a_0 and b_0 are nonzero
polynomials in x_1 and a_0 b_0 ≠ 0 by Lemma 2. Then we have proved that the
product fg of two nonzero polynomials f and g in x_1, x_2 is not zero. If we
prove similarly that the product of two nonzero polynomials in x_1, ..., x_{q-1}
is not zero, we apply the proof above to obtain a_0 b_0 ≠ 0 and hence have
proved that the product fg of two nonzero polynomials in x_1, ..., x_q is not
zero. We have thus completed the proof of

Theorem 2. The product of any two nonzero polynomials in x_1, ..., x_q
is not zero.

We have the immediate consequence

Theorem 3. Let f, g, h be polynomials in x_1, ..., x_q and f be nonzero,
fg = fh. Then g = h.

To continue our discussion we shall need to consider an important special
type of polynomial. Thus we shall call f(x_1, ..., x_q) a homogeneous
polynomial or a form in x_1, ..., x_q if all terms of (5) have the same degree
k = k_1 + ... + k_q. Then, if f is given by (5) and we replace x_j in (5) by
y x_j, we see that each power product x_1^{k_1} ... x_q^{k_q} is replaced by
y^{k_1 + ... + k_q} x_1^{k_1} ... x_q^{k_q} and thus that the polynomial
f(y x_1, ..., y x_q) = y^k f(x_1, ..., x_q) identically in y, x_1, ..., x_q if and
only if f(x_1, ..., x_q) is a form of degree k in x_1, ..., x_q.
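
The homogeneity criterion can be tested on an example. The sketch below is illustrative only: it assumes a polynomial in several variables is stored as a dictionary mapping exponent tuples (k_1, ..., k_q) to coefficients, and the names `is_form` and `evaluate` are not from the text.

```python
def is_form(poly):
    """True if all nonzero terms of poly have the same total degree k_1 + ... + k_q."""
    degrees = {sum(exps) for exps, coeff in poly.items() if coeff != 0}
    return len(degrees) <= 1

def evaluate(poly, point):
    """Value of poly at the tuple point = (x_1, ..., x_q)."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, k in zip(point, exps):
            term *= x ** k
        total += term
    return total

# f = x1^2 x2 + 3 x1 x2^2 - x2^3, a binary form of degree 3
f = {(2, 1): 1, (1, 2): 3, (0, 3): -1}
assert is_form(f)

# f(y x1, y x2) = y^3 f(x1, x2), checked at one sample point
y, point = 3, (2, 5)
scaled = tuple(y * x for x in point)
assert evaluate(f, scaled) == y ** 3 * evaluate(f, point)

# x1 + 1 mixes degrees one and zero, so it is not a form
assert not is_form({(1, 0): 1, (0, 0): 1})
```

A single sample point only illustrates the identity; the statement in the text concerns equality identically in y, x_1, ..., x_q.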

The product of two forms f and g of respective degrees n and m in the
same x_1, ..., x_q is clearly a form of degree m + n and, by Theorem 2, is
nonzero if and only if f and g are nonzero. We now use this result to obtain
the second of the properties we desire. It is a generalization of Lemma 1. 
Observe first that all the terms of the same degree in a nonzero
polynomial (5) may be grouped together into a form of this degree and then
we may express (5) uniquely as the sum

(6) f = f(x_1, ..., x_q) = f_0 + ... + f_n ,

where f_0 is a nonzero form of the same degree n as the polynomial f and f_i is
a form of degree n - i. If also

(7) g = g(x_1, ..., x_q) = g_0 + ... + g_m ,

for forms g_i of degree m - i and such that g_0 ≠ 0, then clearly

(8) fg = h_0 + ... + h_{m+n} ,

where the h_i are forms of degree m + n - i and h_0 = f_0 g_0. By Theorem 2
h_0 ≠ 0. Thus if we call f_0 the leading form of f, we clearly have

Theorem 4. Let f and g be polynomials in x_1, ..., x_q. Then the degree
of fg is the sum of the degrees of f and g and the leading form of fg is the
product of the leading forms of f and g.

The result above is evidently fundamental for the study of polynomials
in several variables, a study which we shall discuss only briefly in these
pages.

5. Rational functions. The integral operations together with the
operation of division by a nonzero quantity form a set of what are called the
rational operations. A rational function of x_1, ..., x_q is now defined to be
any function obtained as the result of a finite number of rational operations
on x_1, ..., x_q and constants. The postulates of elementary algebra were
seen by the reader in his earliest algebraic study to imply that every rational
function of x_1, ..., x_q may be expressed as a quotient

(9) f = a(x_1, ..., x_q) / b(x_1, ..., x_q)

for polynomials a(x_1, ..., x_q) and b(x_1, ..., x_q) ≠ 0. The coefficients of
a(x_1, ..., x_q) and b(x_1, ..., x_q) are then called coefficients of f. Let us
observe then that the set of all rational functions in x_1, ..., x_q with complex
coefficients has a property which we describe by saying that the set is
closed with respect to rational operations. By this we mean that every rational
function of the elements in this set is in the set. This may be seen to be
due to the definitions a/b + c/d = (ad + bc)/bd, (a/b)(c/d) = (ac)/(bd).
Here b and d are necessarily not zero, and we may use Theorem 2 to obtain
bd ≠ 0. Observe, then, that the set of rational functions satisfies the
properties we assumed in Section 1 for our constants, that is, fg = 0 if and only
if f = 0 or g = 0, while if f ≠ 0 then f^{-1} exists such that f f^{-1} = 1.

6. A greatest common divisor process. The existence of a g.c.d. of two 
polynomials and the method of its computation are essential in the study 
of what are called Sturm's functions and so are well known to the reader 
who has studied the Theory of Equations. We shall repeat this material here 
because of its importance for algebraic theories. 

We define the g.c.d. of polynomials f_1(x), ..., f_t(x) not all zero to be any
monic polynomial d(x) which divides all the f_i(x), and is such that if g(x)
divides every f_i(x) then g(x) divides d(x). If d_0(x) is a second such polynomial,
then d(x) and d_0(x) divide each other, d(x) and d_0(x) are associated monic
polynomials and are equal. Hence, according to our definition, the g.c.d.
of f_1(x), ..., f_t(x) is a unique polynomial.

If g(x) divides all the f_i(x), then g(x) divides d(x), and hence the degree
of d(x) is at least that of g(x). Thus the g.c.d. d(x) is a common divisor
of the f_i(x) of largest possible degree and is clearly the unique monic
common divisor of this degree.

If d_j(x) is the g.c.d. of f_1(x), ..., f_j(x) and d_0(x) is the g.c.d. of d_j(x) and
f_{j+1}(x), then d_0(x) is the g.c.d. of f_1(x), ..., f_{j+1}(x). For every common
divisor h(x) of f_1(x), ..., f_{j+1}(x) divides f_1(x), ..., f_j(x), and hence both
d_j(x) and f_{j+1}(x), h(x) divides d_0(x). Moreover, d_0(x) divides f_{j+1}(x) and
the divisor d_j(x) of f_1(x), ..., f_j(x), d_0(x) divides f_1(x), ..., f_{j+1}(x).

The result above evidently reduces the problems of the existence and 
construction of a g.c.d. of any number of polynomials in x not all zero to 
the case of two nonzero polynomials. We shall now study this latter prob- 
lem and state the result we shall prove as 

Theorem 5. Let f(x) and g(x) be polynomials not both zero. Then there
exist polynomials a(x) and b(x) such that

(10) d(x) = a(x)f(x) + b(x)g(x)

is a monic common divisor of f(x) and g(x). Moreover, d(x) is then the unique
g.c.d. of f(x) and g(x).

For if f(x) = 0, then d(x) is associated with g(x), a(x) = 1 and b(x) = b_0^{-1}
is a solution of (10) if g(x) is given by (2). Hence, there is no loss of generality
if we assume that both f(x) and g(x) are nonzero and that the degree of
g(x) is not greater than the degree of f(x). For consistency of notation
we put

(11) h_0(x) = f(x) ,     h_1(x) = g(x) .
By Theorem 1

(12) h_0(x) = q_1(x)h_1(x) + h_2(x) ,

where the degree of h_2(x) is less than the degree of h_1(x). If h_2(x) ≠ 0, we
may apply Theorem 1 to obtain

(13) h_1(x) = q_2(x)h_2(x) + h_3(x) ,

where the degree of h_3(x) is less than that of h_2(x). Thus our division process
yields a sequence of equations of the form

(14) h_{i-2}(x) = q_{i-1}(x)h_{i-1}(x) + h_i(x) ,

where if n_i is the degree of h_i(x) then n_1 > n_2 > ..., while n_i ≥ 0 unless
h_i(x) = 0. We conclude that our sequence must terminate with

(15) h_{r-2}(x) = q_{r-1}(x)h_{r-1}(x) + h_r(x)

and

(16) h_r(x) ≠ 0 ,     h_{r-1}(x) = q_r(x)h_r(x)

for r ≥ 1.

Equation (16) implies that (15) may be replaced by h_{r-2}(x) =
[q_{r-1}(x)q_r(x) + 1]h_r(x). Thus h_r(x) divides both h_{r-1}(x) and h_{r-2}(x). If we
assume that h_r(x) divides h_i(x) and h_{i-1}(x), then (14) implies that h_r(x)
divides h_{i-2}(x). An evident proof by induction shows that h_r(x) divides
both h_0(x) = f(x) and h_1(x) = g(x).

Equation (12) implies that h_2(x) = a_2(x)f(x) + b_2(x)g(x) with a_2(x) = 1,
b_2(x) = -q_1(x). Clearly also h_1(x) = a_1(x)f(x) + b_1(x)g(x) with a_1(x) = 0,
b_1(x) = 1. If, now, h_{i-2}(x) = a_{i-2}(x)f(x) + b_{i-2}(x)g(x) and h_{i-1}(x) =
a_{i-1}(x)f(x) + b_{i-1}(x)g(x) then (14) implies that

h_i(x) = [a_{i-2}(x) - q_{i-1}(x)a_{i-1}(x)]f(x) + [b_{i-2}(x) - q_{i-1}(x)b_{i-1}(x)]g(x) .

Thus we obtain h_r(x) = a_r(x)f(x) + b_r(x)g(x). The polynomial h_r(x) is a
common divisor of f(x) and g(x) and is associated with a monic common
divisor d(x) = c h_r(x). Then d(x) has the form (10) for a(x) = c a_r(x), b(x) =
c b_r(x). Every common divisor of f(x) and g(x) divides the right member of
(10) and hence divides d(x), so that d(x) is a g.c.d. of f(x) and g(x). We have
already shown that d(x) is unique.
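
The recurrences for h_i, a_i, and b_i translate directly into a loop. What follows is a sketch under stated assumptions, not the author's own procedure: polynomials are coefficient lists with the highest power first, arithmetic is exact via `fractions.Fraction`, and the helper names (`trim`, `add`, `mul`, `divide`, `extended_gcd`) are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Remove zero leading coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def add(p, q):
    n = max(len(p), len(q))
    p = [0] * (n - len(p)) + list(p)
    q = [0] * (n - len(q)) + list(q)
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divide(f, g):
    """Division Algorithm: f = q*g + r with deg r < deg g (g nonzero and trimmed)."""
    r = [Fraction(a) for a in f]
    q = []
    while len(r) >= len(g):
        c = r[0] / g[0]
        q.append(c)
        for i, b in enumerate(g):
            r[i] -= c * b
        r.pop(0)
    return trim(q), trim(r)

def extended_gcd(f, g):
    """Return (d, a, b) with d = a*f + b*g the monic g.c.d. of f and g."""
    h_prev, h_cur = trim([Fraction(x) for x in f]), trim([Fraction(x) for x in g])
    a_prev, a_cur = [Fraction(1)], []      # h_0 = 1*f + 0*g
    b_prev, b_cur = [], [Fraction(1)]      # h_1 = 0*f + 1*g
    while h_cur:
        q, h_next = divide(h_prev, h_cur)  # equation (14)
        neg_q = [-c for c in q]
        a_prev, a_cur = a_cur, add(a_prev, mul(neg_q, a_cur))
        b_prev, b_cur = b_cur, add(b_prev, mul(neg_q, b_cur))
        h_prev, h_cur = h_cur, h_next
    if not h_prev:
        raise ValueError("f and g must not both be zero")
    c = 1 / h_prev[0]                      # constant making the divisor monic
    return [c * x for x in h_prev], [c * x for x in a_prev], [c * x for x in b_prev]

f = [3, 0, 8, 0, -3]     # 3x^4 + 8x^2 - 3 = (3x^2 - 1)(x^2 + 3)
g = [1, 2, 3, 6]         # x^3 + 2x^2 + 3x + 6 = (x + 2)(x^2 + 3)
d, a, b = extended_gcd(f, g)
assert d == [1, 0, 3]                        # d(x) = x^2 + 3
assert add(mul(a, f), mul(b, g)) == d        # equation (10)
```

Because only rational operations on the coefficients are used, the coefficients of d, a, and b stay rational whenever those of f and g are.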

The process used above was first discovered by Euclid, who utilized it in 
his geometric formulation of the analogous result on the g.c.d. of integers. 
It is therefore usually called Euclid's process. We observe that it not only 
enables us to prove the existence of d(x) but gives us a finite process by 
means of which d(x) may be computed. Notice finally that d(x) is computed
by a repetition of the Division Algorithm on f(x), g(x) and polynomials
secured from f(x) and g(x) as remainders in the application of the Division
Algorithm. But this implies the result we state as

Theorem 6. The polynomials a(x), b(x), and hence the greatest common
divisor d(x) of Theorem 5 all have coefficients which are rational functions
with rational number coefficients of the coefficients of f(x) and g(x).

We thus have the 

COROLLARY. Let the coefficients of f (x) and g(x) be rational numbers. Then 
the coefficients of their g.c.d. are rational numbers. 

If the only common divisors of f(x) and g(x) are constants, then d(x) = 1
and we shall call f(x) and g(x) relatively prime polynomials. We shall also
indicate this at times by saying that f(x) is prime to g(x) and hence also
that g(x) is prime to f(x). When f(x) and g(x) are relatively prime, we use
(10) to obtain polynomials a(x) and b(x) such that

(17) a(x)f(x) + b(x)g(x) = 1 .

It is interesting to observe that the polynomials a(x) and b(x) in (17) are 
not unique and that it is possible to define a certain unique pair and then 
determine all others in terms of this pair. To do this we first prove the 

LEMMA 5. Let f(x), g(x), and h(x) be nonzero polynomials such that f(x) 
is prime to g(x) and divides g(x)h(x). Then f(x) divides h(x). 

For we may write g(x)h(x) = f(x)q(x) and use (17) to obtain [a(x)f(x) + 
b(x)g(x)]h(x) = [a(x)h(x) + b(x)q(x)]f(x) = h(x) as desired. 

We now obtain 

Theorem 7. Let f(x) of degree n and g(x) of degree m be relatively prime.
Then there exist unique polynomials a_0(x) of degree at most m - 1 and b_0(x)
of degree at most n - 1 such that a_0(x)f(x) + b_0(x)g(x) = 1. Every pair of
polynomials a(x) and b(x) satisfying (17) has the form

(18) a(x) = a_0(x) + c(x)g(x) ,     b(x) = b_0(x) - c(x)f(x)

for a polynomial c(x).

For, if a(x) is any solution of (17), we apply Theorem 1 to obtain the
first equation of (18) with a_0(x) the remainder on division of a(x) by g(x).
Then a_0(x) has degree at most m - 1, a(x)f(x) + b(x)g(x) = a_0(x)f(x) +
[b(x) + c(x)f(x)]g(x) = 1. We define b_0(x) = b(x) + c(x)f(x) and see that
b_0(x)g(x) = -a_0(x)f(x) + 1 has degree at most m + n - 1. By Lemma 1
the degree of b_0(x) is at most n - 1, a_0(x)f(x) + b_0(x)g(x) = 1 as desired.
If now a_1(x) has virtual degree m - 1, b_1(x) virtual degree n - 1 and
a_1(x)f(x) + b_1(x)g(x) = a_0(x)f(x) + b_0(x)g(x), then f(x) clearly divides
[b_0(x) - b_1(x)]g(x). By Lemma 5 the polynomial b_0(x) - b_1(x) of virtual
degree n - 1 is divisible by f(x) of degree n and must be zero. Hence,
b_0(x) = b_1(x), so that a_1(x)f(x) = a_0(x)f(x), a_1(x) = a_0(x). This proves a_0(x)
and b_0(x) unique. But the definition above of a_0(x) as the remainder on
division of a(x) by g(x) shows that then (18) holds.
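
The reduction step of this proof can be carried out mechanically: divide a(x) by g(x), keep the remainder as a_0(x), and absorb the quotient c(x) into b(x). The following is a hedged sketch with the same illustrative conventions as before (coefficient lists, highest power first, exact `Fraction` arithmetic); `reduce_pair` and the helpers are hypothetical names.

```python
from fractions import Fraction

def trim(p):
    """Remove zero leading coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def add(p, q):
    n = max(len(p), len(q))
    p = [0] * (n - len(p)) + list(p)
    q = [0] * (n - len(q)) + list(q)
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divide(f, g):
    """Division Algorithm: f = q*g + r with deg r < deg g (g nonzero and trimmed)."""
    r = [Fraction(a) for a in f]
    q = []
    while len(r) >= len(g):
        c = r[0] / g[0]
        q.append(c)
        for i, b in enumerate(g):
            r[i] -= c * b
        r.pop(0)
    return trim(q), trim(r)

def reduce_pair(a, b, f, g):
    """Given a*f + b*g = 1, return (a0, b0, c) with a = a0 + c*g and b = b0 - c*f."""
    c, a0 = divide(a, g)         # a0 is the remainder of a on division by g
    b0 = add(b, mul(c, f))       # b0 = b + c*f, as in the proof
    return a0, b0, c

f = [1, 0, 1]                    # x^2 + 1, degree n = 2
g = [1, 0]                       # x,       degree m = 1
a = [1, 2, 1]                    # x^2 + 2x + 1
b = [-1, -2, -2, -2]             # -x^3 - 2x^2 - 2x - 2
assert add(mul(a, f), mul(b, g)) == [1]      # the pair satisfies (17)

a0, b0, c = reduce_pair(a, b, f, g)
assert a0 == [1] and b0 == [-1, 0]           # a_0 = 1, b_0 = -x
assert add(mul(a0, f), mul(b0, g)) == [1]
```

Here a_0 has degree 0 ≤ m - 1 and b_0 has degree 1 ≤ n - 1, and the quotient c(x) = x + 2 recovers the original pair through (18).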

There is also a result which is somewhat analogous to Theorem 7 for the
case where f(x) and g(x) are not relatively prime. We state it as

Theorem 8. Let f(x) ≠ 0 and g(x) ≠ 0 have respective degrees n and m.
Then polynomials a(x) ≠ 0 of degree at most m - 1 and b(x) ≠ 0 of degree
at most n - 1 such that

(19) a(x)f(x) + b(x)g(x) = 0

exist if and only if f(x) and g(x) are not relatively prime.

For if the g.c.d. of f(x) and g(x) is a nonconstant polynomial d(x), we
have f(x) = f_1(x)d(x), g(x) = g_1(x)d(x), g_1(x)f(x) + [-f_1(x)]g(x) = 0 where
g_1(x) has degree less than m and f_1(x) has degree less than n. Conversely,
let (19) hold. If f(x) and g(x) are relatively prime, we have a_0(x)f(x) +
b_0(x)g(x) = 1, a(x) = a_0(x)a(x)f(x) + a(x)b_0(x)g(x) = g(x)[a(x)b_0(x) -
a_0(x)b(x)]. But then g(x) of degree m divides a(x) ≠ 0 of degree at most
m - 1 which is impossible.

EXERCISES 

1. Extend Theorems 5, 6, and the corollary to a set of polynomials f_1(x), ...,
f_t(x).

2. Let f_1(x), ..., f_t(x) be all polynomials of the first degree. State their
possible g.c.d.'s and the conditions on the f_i(x) for each such possible g.c.d.

3. State the results corresponding to those above for polynomials of virtual
degree two.

4. Prove that the g.c.d. of f(x) and g(x) is the monic polynomial of least possible
degree of the form (10). Hint: Show that if d(x) is this polynomial then f(x) =
q(x)d(x) + r(x), r(x) has the form (10) as well as degree less than that of d(x) and
so must be zero.

5. A polynomial f(x) is called rationally irreducible if f(x) has rational coefficients
and is not the product of two nonconstant polynomials with rational coefficients.
What are the possible g.c.d.'s of a set of rationally irreducible f_i(x) of Ex. 1?

6. Let f(x) be rationally irreducible, g(x) have rational coefficients. Show
that f(x) either divides g(x) or is prime to g(x). Thus, f(x) is prime to g(x) if the
degree of g(x) is less than that of f(x).
7. Use Ex. 1 of Section 2 together with the results above to show that a
rationally irreducible polynomial has no multiple roots.

8. Find the g.c.d. of each of the following sets of polynomials as well as of all
possible pairs of polynomials in each case:

a) f_1 = 2x^5 - x^3 - 2x^2 - 6x + 4
   f_2 = x^4 + x^3 - x^2 - 2x - 2

b) f_1 = 3x^4 + 8x^2 - 3
   f_2 = x^3 + 2x^2 + 3x + 6

c) f_1 = x^4 - 2x^3 - 2x^2 - 2x - 3
   f_2 = x^3 + 6x^2 + 11x + 6
   f_3 = x^4 - 8x^2 - 5x + 6

d) f_1 = x^3 + 4x^2 + x - 6
   f_2 = x^2 - 3x + 2
   f_3 = x^5 - 3x^3 + x^2 - 4x - 4

e) f_1 = x^5 + 2x^4 - x^3 - 5x^2 - 6x - 3
   f_2 = x^3 + x^2 + 3x + 3
   f_3 = x^4 + x^3 - x - 1

f) f_1 = x^3 + 2x^2 - 3x - 6
   f_2 = x^2 + x - 2
   f_3 = x^4 + 2x^3 + x + 2

9. Let f(x) be a rationally irreducible polynomial and c be a complex root of
f(x) = 0. Show that, if g(x) is a polynomial in x with rational coefficients and g(c) ≠
0, there then exists a polynomial h(x) of degree less than that of f(x) and with
rational coefficients such that g(c)h(c) = 1.

10. Let f(x) be a rationally irreducible quadratic polynomial and c be a complex
root of f(x) = 0. Show that every rational function of c with rational coefficients
is uniquely expressible in the form a + bc with a and b rational numbers.

11. Let /i, . . . , ft be polynomials in x of virtual degree n and/i ^ 0. Use Ex. 4 
of Section 3 to show that if d(x) is the g.c.d. of /i, ...,/* then the g.c.d. of /i, ...,/< 
is 3. Thus, show that the g.c.d. of n(/i), . . . , n(f t ) has the form x k d, for an integer 
fc> 0. 

7. Forms. A polynomial of degree n is frequently spoken of as an n-ic 
polynomial. The reader is already familiar with the terms linear, quadratic, 
cubic, quartic, and quintic polynomial in the respective cases n = 1, 2, 3, 4, 5. 

In a similar fashion a polynomial in x_1, . . . , x_q is called a q-ary polynomial.
As above, we specialize the terminology in the cases q = 1, 2, 3,
4, 5 to be unary, binary, ternary, quaternary, and quinary.

The terminology just described is used much more frequently in connec- 
tion with theorems on forms than in the study of arbitrary polynomials. 
In particular, we shall find that our principal interest is in n-ary quadratic 
forms. 

There are certain special forms which are quadratic in a set of variables
x_1, . . . , x_m, y_1, . . . , y_n and which have special importance because they
are linear in both x_1, . . . , x_m and y_1, . . . , y_n, separately. We shall call
such forms bilinear forms. They may be expressed as forms

(20)    f = \sum_{i=1,...,m; j=1,...,n} x_i a_{ij} y_j ,

so that we may thus write

(21)    f = \sum_{j=1}^{n} ( \sum_{i=1}^{m} x_i a_{ij} ) y_j

and see that f may be regarded as a linear form in y_1, . . . , y_n whose coefficients
are linear forms in x_1, . . . , x_m.

A bilinear form f is called symmetric if it is unaltered by the interchange
of correspondingly labeled members of its two sets of variables. This statement
clearly has meaning only if m = n; and f is symmetric if and only if
f = \sum x_i a_{ij} y_j = \sum y_i a_{ij} x_j. But f = \sum y_j a_{ij} x_i, and hence f is
symmetric if and only if m = n,

(22)    a_{ij} = a_{ji}    (i, j = 1, . . . , n) .

A quadratic form f is evidently a sum of terms of the type a_i x_i^2 as well as
the type c_{ij} x_i x_j for i ≠ j. We may write a_{ii} = a_i, a_{ij} = a_{ji} = (1/2) c_{ij}
for i ≠ j and have c_{ij} x_i x_j = a_{ij} x_i x_j + a_{ji} x_j x_i so that

(23)    f = \sum_{i,j=1}^{n} x_i a_{ij} x_j    (a_{ij} = a_{ji}; i, j = 1, . . . , n) .

We compare this with (22) and conclude that a quadratic form may be regarded
as the result of replacing the variables y_1, . . . , y_n in a symmetric
bilinear form in x_1, . . . , x_n and y_1, . . . , y_n by x_1, . . . , x_n, respectively.
Later we shall obtain a theory of equivalence of quadratic forms and shall
use the result just derived to obtain a parallel theory of symmetric bilinear
forms.

A final type of form of considerable interest is the skew bilinear form.
Here again m = n, and we call a bilinear form f skew if f = f(x_1, . . . , x_n;
y_1, . . . , y_n) = -f(y_1, . . . , y_n; x_1, . . . , x_n). Thus skew bilinear forms are
forms of the type

(24)    f = \sum_{i,j=1}^{n} x_i a_{ij} y_j ,

where

(25)    a_{ij} = -a_{ji}    (i, j = 1, . . . , n) .

It follows that a_{ii} + a_{ii} = 0, that is,

(26)    a_{ii} = 0    (i = 1, . . . , n) .

Hence f is a sum of terms a_{ij}(x_i y_j - x_j y_i) for i < j, i = 1, . . . , n - 1,
j = 2, . . . , n. It is also evident that if we replace the y_j by corresponding
x_j, then the new quadratic form f(x_1, . . . , x_n; x_1, . . . , x_n) is the zero polynomial.
It is important for the reader to observe thus that while (22) may
be associated with both quadratic and symmetric bilinear forms we must
associate (25) only with skew bilinear forms.
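
The passage from a quadratic form to a symmetric array of coefficients, and the
vanishing of a skew form when the y's are replaced by the x's, may be tried on
small examples. The Python sketch below is only an illustration: the function
names are hypothetical, indices start at 0 rather than 1, and the cross
coefficients c_ij are assumed to be supplied once for each pair i < j.

```python
from fractions import Fraction

def symmetric_coefficients(squares, cross):
    """Return the symmetric array (a_ij) used in (23).

    squares[i] is the coefficient a_i of x_i^2; cross[(i, j)] with i < j is the
    coefficient c_ij of x_i x_j.
    """
    n = len(squares)
    a = [[Fraction(0)] * n for _ in range(n)]
    for i, a_i in enumerate(squares):
        a[i][i] = Fraction(a_i)
    for (i, j), c_ij in cross.items():
        a[i][j] = a[j][i] = Fraction(c_ij) / 2   # a_ij = a_ji = c_ij / 2
    return a

def bilinear(a, x, y):
    """Evaluate the sum of the terms x_i a_ij y_j."""
    return sum(x[i] * a[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# The quadratic form 2x_1^2 - 4x_1x_2 + 3x_2^2 as a symmetric bilinear form.
a = symmetric_coefficients([2, 3], {(0, 1): -4})
x, y = (1, 2), (5, -1)
quadratic_value = bilinear(a, x, x)          # the y's replaced by the x's
symmetric_ok = bilinear(a, x, y) == bilinear(a, y, x)

# An array satisfying (25); replacing the y's by the x's now gives zero.
skew = [[0, 3, -1], [-3, 0, 4], [1, -4, 0]]
u, v = (1, 2, 3), (2, 0, -1)
skew_ok = bilinear(skew, u, v) == -bilinear(skew, v, u)
skew_on_equal_arguments = bilinear(skew, u, u)
```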

ORAL EXERCISES 

1. Use the language above to describe the following forms: 

a) z 3 + Zxy* + 3 d) x\ + 2xfli 

ft) xl + 2/i e) Xiy 2 - x#i 

c) 2xiyi + x 2 yi + x^ 2 

2. Express the following quadratic forms as sums of the kind given by (23) : 

a) 2x\ 
ft) xf - x\ 



8. Linear forms. A linear form is expressible as a sum

(27)    f = a_1 x_1 + . . . + a_n x_n = \sum_{i=1}^{n} a_i x_i .

We shall call (27) a linear combination of x_1, . . . , x_n with coefficients
a_1, . . . , a_n. The concept of linear combination has already been used without
the name in several instances. Thus any polynomial in x is a linear
combination of a finite number of non-negative integral powers of x with
constant coefficients, a polynomial in x_1, . . . , x_q is a linear combination
of a finite number of power products x_1^{k_1} . . . x_q^{k_q} with constant coefficients,
and the g.c.d. of f(x) and g(x) is a linear combination (10) of f(x) and g(x) with
polynomials in x as coefficients.

The form (27) with a_1 = a_2 = . . . = a_n = 0 is the zero form. If g is
a second form,

(28)    g = b_1 x_1 + . . . + b_n x_n ,

with constant coefficients b_1, . . . , b_n, we see that

(29)    f + g = (a_1 + b_1)x_1 + . . . + (a_n + b_n)x_n .

Also if c is any constant, we have

(30)    cf = (ca_1)x_1 + . . . + (ca_n)x_n .

We define -f to be the form such that f + (-f) = 0 and see that

(31)    -f = -1 · f = (-a_1)x_1 + . . . + (-a_n)x_n .

Then f = g if and only if f - g = f + (-g) = 0, that is, a_i = b_i (i = 1,
. . . , n).

The properties just set down are only trivial consequences of the usual 
properties of polynomials and, as such, may seem to be relatively unimpor- 
tant. They may be formulated abstractly, however, as properties of se- 
quences of constants (which may be thought of, if we so desire, as the coeffi- 
cients of linear forms) and in this formulation in Chapter IV will be very 
important for all algebraic theory. The reader is already familiar with these 
properties which he has used in the computation of determinants by opera- 
tions on its rows and columns. 

Let, then, u be a sequence

(32)    u = (a_1, . . . , a_n)

of n constants a_i called the elements of the sequence u. If a is any constant,
we define

(33)    au = ua = (aa_1, . . . , aa_n)

and call au the scalar product of u by a. We now consider a second sequence,

(34)    v = (b_1, . . . , b_n) ,

and define the sum of u and v by

(35)    u + v = (a_1 + b_1, . . . , a_n + b_n) .

Then the linear combination

(36)    au + bv = (aa_1 + bb_1, . . . , aa_n + bb_n)

has been uniquely defined for all constants a and b and all sequences
u and v.

The sequence all of whose elements are zero will be called the zero
sequence and designated by 0. It is clearly the unique sequence z with the
property that u + z = u for every sequence u. Evidently if a is the zero
constant, au = 0 for every u.

We define the negative -u of a sequence u to be the sequence v such
that u + v = 0. Evidently, then, -u is the unique sequence

(37)    -u = -1 · u = (-a_1, . . . , -a_n) ,

and we see that the unique solution of the equation u + x = v is the sequence
v + (-u). We evidently call this sequence

(38)    v - u = (b_1 - a_1, . . . , b_n - a_n) .

The reader should observe now that the definitions and properties derived 
for linear combinations of sequences are precisely those which hold for the 
sequences of coefficients of corresponding linear combinations of linear 
forms and that the usual laws of algebra for addition and multiplication 
hold for addition of sequences and multiplication of sequences by constants. 
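
The definitions (33), (35), (36), (37), and (38) translate directly into a few
lines of code. The following Python sketch is an illustration only; sequences
are taken to be tuples, and the function names are hypothetical.

```python
def scalar_product(a, u):
    """The sequence au of (33)."""
    return tuple(a * x for x in u)

def add(u, v):
    """The sum u + v of (35); both sequences must have n elements."""
    if len(u) != len(v):
        raise ValueError("sequences must have the same number of elements")
    return tuple(x + y for x, y in zip(u, v))

def linear_combination(a, u, b, v):
    """The sequence au + bv of (36)."""
    return add(scalar_product(a, u), scalar_product(b, v))

def negative(u):
    """The sequence -u of (37)."""
    return scalar_product(-1, u)

def subtract(v, u):
    """The sequence v - u of (38), the solution x of u + x = v."""
    return add(v, negative(u))

u = (1, 2, 3)
v = (4, 0, -1)
combo = linear_combination(2, u, 3, v)    # (14, 4, 3)
x = subtract(v, u)                        # (3, -2, -4)
```

One may check that add(u, x) returns v, as the text asserts for the solution of
u + x = v.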

9. Equivalence of forms. If f = f(x_1, . . . , x_q) is a form of degree n in
x_1, . . . , x_q and we replace every x_i in f by a corresponding linear form

(39)    x_i = a_{i1} y_1 + . . . + a_{ir} y_r    (i = 1, . . . , q) ,

we obtain a form g = g(y_1, . . . , y_r) of the same degree n in y_1, . . . , y_r.
Then we shall say that f is carried into g (or that g is obtained from f) by
the linear mapping (39). If q = r and the determinant

(40)    \begin{vmatrix}
            a_{11} & \cdots & a_{1q} \\
            \vdots &        & \vdots \\
            a_{q1} & \cdots & a_{qq}
        \end{vmatrix}

is not zero, we shall say that (39) is nonsingular. In this case it is easily
seen that we may solve (39) for y_1, . . . , y_q as linear forms in x_1, . . . , x_q and
obtain a linear mapping which we may call the inverse of (39). This terminology
is justified by the fact that the equation f(x_1, . . . , x_q) = g(y_1,
. . . , y_q) is an identity, and thus if we replace y_1, . . . , y_q in g(y_1, . . . , y_q)
by the corresponding linear forms in x_1, . . . , x_q we obtain the original form
f(x_1, . . . , x_q).

We now consider two forms f = f(x_1, . . . , x_q) and g = g(x_1, . . . , x_q) of
the same degree n. Then we shall say that f is equivalent to g if f is carried
into g(y_1, . . . , y_q) by a nonsingular linear mapping. The statements above
imply that if f is equivalent to g then g is also equivalent to f. Thus, we
shall usually say simply that f and g are equivalent. We shall not study the
equivalence of forms of arbitrary degree but only of the special kinds of
forms described in Section 7, and even of those forms only under restricted
types of linear mappings.
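
For binary quadratic forms the effect of a linear mapping and of its inverse can
be computed explicitly. In the Python sketch below, which is an illustration
under our own conventions, a form a x_1^2 + b x_1 x_2 + c x_2^2 is the triple
(a, b, c), a mapping x_1 = p y_1 + q y_2, x_2 = r y_1 + s y_2 is the pair of rows
((p, q), (r, s)), and the function names are hypothetical.

```python
from fractions import Fraction

def carry(form, mapping):
    """Substitute the linear forms of the mapping for x_1 and x_2."""
    a, b, c = form
    (p, q), (r, s) = mapping
    return (a*p*p + b*p*r + c*r*r,                # coefficient of y_1^2
            2*a*p*q + b*(p*s + q*r) + 2*c*r*s,    # coefficient of y_1 y_2
            a*q*q + b*q*s + c*s*s)                # coefficient of y_2^2

def inverse(mapping):
    """Solve the mapping for y_1, y_2; its determinant must not be zero."""
    (p, q), (r, s) = mapping
    det = Fraction(p*s - q*r)
    if det == 0:
        raise ValueError("the mapping is singular")
    return ((s / det, -q / det), (-r / det, p / det))

f = (2, -4, 3)                 # 2x_1^2 - 4x_1x_2 + 3x_2^2 = 2(x_1 - x_2)^2 + x_2^2
mapping = ((1, 1), (0, 1))     # x_1 = y_1 + y_2, x_2 = y_2
g = carry(f, mapping)          # (2, 0, 1), that is, 2y_1^2 + y_2^2
f_again = carry(g, inverse(mapping))
```

Applying the inverse mapping to g returns the original triple, in agreement with
the statement that equivalence of forms is a symmetric relation.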

We have now obtained the background needed for a clear understanding 
of matrix theory and shall proceed to its development. 

EXERCISES 

1. The linear mapping (39) of the form x_i = y_i for i = 1, . . . , q is called the
identical mapping. What is its effect on any form f?

2. Apply a nonsingular linear mapping to carry each of the following forms to an
expression of the type a_1 y_1^2 + a_2 y_2^2. Hint: Write f = a_1(x_1 + c x_2)^2 + . . . by
completing the square on the terms in x_1 and put x_1 + c x_2 = y_1, x_2 = y_2.

a) 2x_1^2 - 4x_1x_2 + 3x_2^2
b) x_1^2 + 14x_1x_2 + 9x_2^2
c) 3x_1^2 + 18x_1x_2 + 24x_2^2
d) 2x_1^2 - x_1x_2
e) 3x_1^2 + 2x_1x_2 - x_2^2

3. Find the inverses of the following linear mappings:

a) 2x_1 + x_2 = y_1            b) x_1 + x_2 = y_1
   3x_1 + 2x_2 = y_2              -2x_1 + x_2 = y_2

4. Apply the linear mappings of Ex. 3 to the following forms / to obtain equiva- 
lent forms g and their inverses to g to obtain /. 



b) f = 4x_1^2 - 4x_1x_2 + 3x_2^2



CHAPTER II 

RECTANGULAR MATRICES AND ELEMENTARY 
TRANSFORMATIONS 

1. The matrix of a system of linear equations. The concept of a rectangular
matrix may be thought of as arising first in connection with the
study of the solution of a system

(1)    a_{11} y_1 + a_{12} y_2 + . . . + a_{1n} y_n = k_1 ,
       a_{21} y_1 + a_{22} y_2 + . . . + a_{2n} y_n = k_2 ,
       . . . . . . . . . . . . . . . . . . . . . . . .
       a_{m1} y_1 + a_{m2} y_2 + . . . + a_{mn} y_n = k_m

of m linear equations in n unknowns y_1, . . . , y_n, with constant coefficients
a_{ij}. The array of coefficients arranged as they occur in (1) has the form

(2)    A = \begin{pmatrix}
               a_{11} & a_{12} & \cdots & a_{1n} \\
               a_{21} & a_{22} & \cdots & a_{2n} \\
               \vdots & \vdots &        & \vdots \\
               a_{m1} & a_{m2} & \cdots & a_{mn}
           \end{pmatrix}

and is called the coefficient matrix of the system (1). We shall henceforth
speak of the coefficients a_{ij} and k_i in (1) as scalars and shall derive our theorems
with the understanding that they are constants (with respect to the
variables y_1, . . . , y_n) according to the usual definitions and hence satisfy
the properties usually assumed in algebra for rational operations. In a later
chapter we shall make a completely explicit statement about the nature
of these quantities.

It is not only true that the concept of a matrix arises as above in the 
study of systems of linear equations, but many matrix properties are ob- 
tainable by observing the effect, on the matrix of a system, of certain natu- 
ral manipulations on the equations themselves with which the reader is 
very familiar. We shall devote this beginning chapter on matrices to that 
study. 

Let us now recall some terminology with which the reader is undoubtedly
familiar. The line

(3)    u_i = (a_{i1}, . . . , a_{in})

of coefficients in the ith equation of (1) occurs in (2) as its ith horizontal
line. Thus, it is natural to call u_i the ith row of the matrix A. Similarly,
the coefficients of the unknown y_j in (1) form a vertical line

(4)    \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}

which we call the jth column of A.

We may now speak of A as a matrix of m rows and n columns, as an
m-rowed and n-columned matrix or, briefly, as an m by n matrix. Then
the rows of A are 1 by n matrices and its columns are m by 1 matrices. We
shall speak of the scalars a_{ij} as the elements of A, and they may be regarded
as one by one matrices. The notation a_{ij} which we adopt for the element
of A in its ith row and jth column will be used consistently, and this usage
will be of some importance in the clarity of our exposition. To avoid bulky
displayed equations we shall usually not use the notation (2) for a matrix
but shall write instead

(5)    A = (a_{ij})    (i = 1, . . . , m; j = 1, . . . , n) .

If m = n then A is a square matrix, and we shall speak of A simply as an
n-rowed square matrix. This, too, is a concept and terminology which we
shall use very frequently.

ORAL EXERCISES 
1. Read off the elements an, ai2, 024, a^ in the following matrices 





c) 



4050^ 
-1 -2 -3 
3147 
6 0-1 8/ 



3 2 4 
-l 1 6 -6 



2. Read off the second row and the third column in each of the matrices of Ex. 1. 

3. Read off the systems of equations (1) with constants k_i all zero and matrices
of coefficients as in Ex. 1.

2. Submatrices. In solving the system (1) by the usual methods the
reader is led to study subsystems of s ≤ m equations in certain t ≤ n of
the unknowns. The corresponding coefficient matrix has s rows and t columns,
and its elements lie in certain s of the rows and t of the columns of A.
We call such a matrix an s by t submatrix B of A. If s < m and t < n,
the elements in the remaining m - s rows and n - t columns form an
(m - s) by (n - t) submatrix C of A and we shall call C the complementary
submatrix of B. Clearly, then, B is the complementary submatrix of C.

It will be desirable from time to time to regard a matrix as being made
up of certain of its submatrices. Thus we write

(6)    A = (A_{ij})    (i = 1, . . . , s; j = 1, . . . , t) ,

where now the symbols A_{ij} themselves represent rectangular matrices. We
assume that for any fixed i the matrices A_{i1}, A_{i2}, . . . , A_{it} all have the same
number of rows, and for fixed k the matrices A_{1k}, A_{2k}, . . . , A_{sk} have the
same number of columns. It is then clear how each row of A is a 1 by t
matrix whose elements are rows of A_{i1}, . . . , A_{it} in adjacent positions and
similarly for columns. We have thus accomplished what we shall call the
partitioning (6) of A by what amounts to drawing lines mentally parallel to
the rows and columns of A and between them and designating the arrays
of elements in the smallest rectangles so formed by A_{ij}. Our principal use
of (6) will be the use of the case where we shall regard A as a two by two
matrix

(7)    A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}

whose elements A_1, A_2, A_3, A_4 are themselves rectangular matrices. Then
A_1 and A_2 have the same number of rows, A_3 and A_4 have the same number
of rows, and every row of A consists partially of a row of A_1 and of a
corresponding row of A_2 or of a row of A_3 and a corresponding row of A_4.
Note our usage in (2), (5), (6), (7) of the symbol of equality for matrices.
We shall always mean that two matrices are equal if and only if they are
identical, that is, have the same size and equal corresponding elements.
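
The partitioning (7) amounts to cutting the array after a chosen row and a
chosen column. The Python sketch below is an illustration only; a matrix is
assumed to be a list of rows, and the function names partition and assemble are
hypothetical.

```python
def partition(A, r, s):
    """Cut A after its first r rows and first s columns, as in (7)."""
    A1 = [row[:s] for row in A[:r]]
    A2 = [row[s:] for row in A[:r]]
    A3 = [row[:s] for row in A[r:]]
    A4 = [row[s:] for row in A[r:]]
    return A1, A2, A3, A4

def assemble(A1, A2, A3, A4):
    """Rebuild A: each row is a row of A1 followed by the corresponding row
    of A2, or a row of A3 followed by the corresponding row of A4."""
    top = [left + right for left, right in zip(A1, A2)]
    bottom = [left + right for left, right in zip(A3, A4)]
    return top + bottom

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
A1, A2, A3, A4 = partition(A, 1, 2)
# A1 is 1 by 2, A2 is 1 by 2, A3 is 2 by 2, and A4 is 2 by 2.
same = assemble(A1, A2, A3, A4) == A
```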

EXERCISES 

1. State how the columns of A of (7) are connected with the columns of A_1, A_2,
A_3, and A_4.

2. Introduce a notation for an arbitrary six-rowed square matrix A and partition
A into a three-rowed square matrix whose elements are two-rowed square matrices.
Also partition A into a two-rowed square matrix whose elements are three-rowed
square matrices.

3. Write out all submatrices of the matrix 

2 -1 3 4 5\ 
1 02-1 -2 1 
01236 
7 3 2 1 

and, if they exist, the complementary submatrices. 

4. Which of the submatrices in Ex. 3 occur in some partitioning of A as a matrix
of submatrices?

5. Partition the following matrices so that they become three-rowed square matrices
whose elements are two-rowed square matrices and state the results in the
notation (6).



a) 



6. Partition the matrices of Ex. 5 into two-rowed square matrices whose elements 
are three-rowed square matrices. 

7. Partition the matrices of Ex. 5 into the form (7) such that A_1 is a two by three
matrix; a one by six matrix; a two by two matrix. Read off A_2, A_3, and A_4 and
state their sizes.

3. Transposition. The theory of determinants arose in connection with 
the solution of the system (1). The reader will recall that many of the prop- 
erties of determinants were only proved as properties of the rows of a de- 
terminant, and then the corresponding column properties were merely 
stated as results obtained by the process of interchanging rows and columns. 
We call the induced process transposition and define it as follows for mat- 
rices. Let A be an m by n matrix, a notation for which is given by (5), 
and define the matrix 



(8)    A' = \begin{pmatrix}
                a_{11} & a_{21} & \cdots & a_{m1} \\
                a_{12} & a_{22} & \cdots & a_{m2} \\
                \vdots & \vdots &        & \vdots \\
                a_{1n} & a_{2n} & \cdots & a_{mn}
            \end{pmatrix} ,



which we shall call the transpose of A. It is an n by m matrix obtained from
A by interchanging its rows and columns. Thus, the element a_{ij} in the ith
row and jth column of A occurs in A' as the element in its jth row and ith
column. Note then that in accordance with our conventions (8) could have
been written as

(9)    A' = (a_{ji})    (j = 1, . . . , n; i = 1, . . . , m) .

We also observe the evident theorem which we state simply as

(10)    (A')' = A .

The operation of transposition may be regarded as the result of a certain
rigid motion of the matrix which we shall now describe. If A is our m by n
matrix and m ≤ n we put q = m and write

(11)    A = (A_1, A_2) ,

where A_1 is a q-rowed square matrix. On the other hand, if n < m, we put
q = n and have

(12)    A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} ,

where the matrix A_1 is again a q-rowed square matrix. The line of elements
a_{11}, a_{22}, . . . , a_{qq} of A and hence of A_1 is called the principal diagonal of A
or, simply, the diagonal of A. It is a diagonal of the square matrix A_1 and
is its principal diagonal. We shall call the a_{ii} the diagonal elements of A.
Notice now that A' is obtained from A by using the diagonal of A as an
axis for a rigid rotation of A so that each row of A becomes a column of A'.
We should also observe that if A has been partitioned so that it has the
form (6) then A' is the t by s matrix of matrices given by

(13)    A' = (G_{ji})    (G_{ji} = A'_{ij}; j = 1, . . . , t; i = 1, . . . , s) .

We have now given some simple concepts in the theory of matrices and 
shall pass on to a study of certain fundamental operations. 
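
Transposition is easily tried on a small array. In the following Python sketch,
given as an illustration only, a matrix is assumed to be a list of rows and the
function name transpose is our own. It checks that transposing twice restores
the matrix and that, for a matrix partitioned as two blocks side by side, the
transpose consists of the transposed blocks placed one above the other.

```python
def transpose(A):
    """Return A': the element in row i, column j of A moves to row j, column i."""
    return [list(column) for column in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]
At = transpose(A)                  # a 3 by 2 matrix

# Partition A into a square block A1 and a remaining block A2 side by side.
A1 = [row[:2] for row in A]
A2 = [row[2:] for row in A]
stacked = transpose(A1) + transpose(A2)   # A1' placed above A2'
```

Here transpose(At) equals A, and stacked equals At.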

EXERCISES 

1. Let A have the form (7). Give the corresponding notation for A'. Give also 
A' if A is any matrix of Ex. 1 of Section 1. 

2. In Ex. 1 assume that A = A'. What then is the form (7) of A? Obtain the
analogous result if A' = -A, where -A is the matrix whose elements are the negatives
of those of A.

3. Let A be a three-rowed square matrix. Find the form (2) of A if A = A' and
also if A = -A'.

4. Prove that the determinant of every three-rowed square matrix A with the
property that A' = -A is zero.

5. Solve the system (1) with matrix

       A = \begin{pmatrix} 1 & -2 & 1 \\ -1 & 3 & -2 \\ 2 & -4 & 3 \end{pmatrix}

for y_1, y_2, y_3 in terms of k_1, k_2, k_3. Write the results as y_i = \sum_{j=1}^{3} b_{ij} k_j
and thus compute the matrix B = (b_{ij}). Do this also for the system (1) with
matrix A' and compare the results.

4. Elementary transformations. The system (1) may be solved by the 
method of elimination, and the reader is familiar with the operations on 
equations which are permitted in this method and which yield systems
said to be equivalent to (1). The resulting operations on the rows of the 
matrix A of the system and corresponding operations on the columns of A 
are called elementary transformations on A and will turn out to be very useful 
tools in the theory of matrices. 

The first of our transformations is the result on the rows of A of the inter- 
change of two equations of the defining system. We define this and the 
corresponding column transformation in the 

DEFINITION 1. Let i ≠ r and B be the matrix obtained from A by interchanging
its ith and rth rows (columns). Then B is said to be obtained from A
by an elementary row (column) transformation of type 1.

The rows (columns) of an m by n matrix are sequences of n (of m) ele- 
ments, and the operations of addition and scalar multiplication (i.e., mul- 
tiplication by a scalar) of such sequences were defined in Section 1.8.* The 
left members of (1) are linear forms. The addition of a scalar multiple of 
one equation of (1) to another results in the addition of a corresponding 
multiple of a corresponding linear form to another and hence to a corre- 
sponding result on the rows of A. Thus we make the following 

DEFINITION 2. Let i and r be distinct integers, c be a scalar, and B be the
matrix obtained by the addition to the ith row (column) of A of the multiple
by c of its rth row (column). Then B is said to be obtained from A by an elementary
row (column) transformation of type 2.

* We shall use a corresponding notation henceforth when we make references anywhere
in our text to results in previous chapters. Thus, for example, by Section 4.7,
Theorem 4.8, Lemma 4.9, equation (4.10) we shall mean Section 7, Theorem 8, Lemma 9,
equation (10) in Chapter IV. However, if the prefix is omitted, as, for example, Theorem 8,
we shall mean that theorem of the chapter in which the reference is made.

Our final type of transformation is induced by the multiplication of an
equation of the system (1) by a nonzero scalar a. The restriction a ≠ 0
is made so that A will be obtainable from B by the like transformation
for a^{-1}. Later we shall discuss matrices whose elements are polynomials
in x and use elementary transformations with polynomial scalars a. We
shall then evidently require a to be a polynomial with a polynomial inverse
and hence to be a constant not zero. In view of this fact we shall phrase
the definition in our present environment so as to be usable in this other
situation and hence state it as

DEFINITION 3. Let the scalar a possess an inverse a^{-1} and the matrix B be
obtained as the result of the multiplication of the ith row (column) of A by a.
Then B is said to be obtained from A by an elementary row (column*) transformation
of type 3.

The fundamental theorems in the theory of matrices are connected with 
the study of the matrices obtained from a given matrix A by the applica- 
tion of a finite sequence of elementary transformations, restricted by the 
particular results desired, to A. Thus, it is of basic importance to study 
first what occurs if we make no restriction whatever on the elementary 
transformations allowed. For convenience in our discussion we first make 
the 

DEFINITION. Let A and B be m by n matrices and let B be obtainable from A
by the successive application of finitely many arbitrary elementary transformations.
Then we shall say that A is rationally equivalent to B and indicate this
by writing A ≅ B.

We now observe some simple consequences of our definition. First, we 
see that, if A is rationally equivalent to B and B is rationally equivalent 
to C, the combination of the elementary transformations which carry A to B 
with those which carry B to C will carry A to C. Then A is rationally 
equivalent to C. Observe next that every m by n matrix A is rationally 
equivalent to itself. For the elementary transformations of type 2 with
c = 0 and of type 3 with a = 1 are identical transformations leaving all
matrices unaltered.

Finally, we see that if an elementary transformation carries A to B there
is an inverse transformation of the same type carrying B to A. In fact, the
inverse of any transformation of type 2 defined for c is that defined for -c,
of type 3 defined for a is that defined for a^{-1}, of type 1 is itself. But then A
is rationally equivalent to B if and only if B is rationally equivalent to A.

Thus we may and shall replace the terminology "A is rationally equivalent
to B" in the definition above by "A and B are rationally equivalent."

* The reader should verify the fact that, if we apply any elementary row transformation
to A and then any column transformation to the result, the matrix obtained is the
same as that which we obtain by applying first the column transformation and then the
row transformation.
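
The three types of elementary transformation, their inverses, and the remark
that row and column transformations may be applied in either order can all be
tried numerically. The Python sketch below is an illustration only: a matrix is
assumed to be a list of rows, rows are numbered from 0, and the function names
are hypothetical.

```python
from fractions import Fraction

def interchange_rows(A, i, r):
    """Type 1: interchange rows i and r."""
    B = [row[:] for row in A]
    B[i], B[r] = B[r], B[i]
    return B

def add_row_multiple(A, i, r, c):
    """Type 2: add c times row r to row i, for distinct i and r."""
    if i == r:
        raise ValueError("type 2 needs distinct rows")
    B = [row[:] for row in A]
    B[i] = [x + c * y for x, y in zip(A[i], A[r])]
    return B

def scale_row(A, i, a):
    """Type 3: multiply row i by a scalar a which has an inverse."""
    if a == 0:
        raise ValueError("type 3 needs a nonzero scalar")
    B = [row[:] for row in A]
    B[i] = [a * x for x in A[i]]
    return B

def on_columns(row_op, A, *args):
    """Apply a row transformation to the columns by way of the transpose."""
    At = [list(col) for col in zip(*A)]
    return [list(col) for col in zip(*row_op(At, *args))]

A = [[1, -2, 1], [-1, 3, -2]]
# Each transformation is undone by a transformation of the same type.
back1 = interchange_rows(interchange_rows(A, 0, 1), 0, 1)
back2 = add_row_multiple(add_row_multiple(A, 0, 1, 5), 0, 1, -5)
back3 = scale_row(scale_row(A, 1, 3), 1, Fraction(1, 3))
# A row transformation and a column transformation in either order.
row_then_col = on_columns(scale_row, add_row_multiple(A, 0, 1, 5), 2, 7)
col_then_row = add_row_multiple(on_columns(scale_row, A, 2, 7), 0, 1, 5)
```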

We have now shown that in order to prove that A and B are rationally 
equivalent it suffices to prove A and B both rationally equivalent to the 
same matrix C. As a tool in such proofs we then prove the following 

LEMMA 1. Let r < m, s < n, and A and B be m by n matrices of the form

       A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix} ,    B = \begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix}

for r by s rationally equivalent matrices A_1 and B_1 and (m - r) by (n - s)
rationally equivalent matrices A_2 and B_2. Then A and B are rationally equivalent.

For it is clear that any elementary transformation on the first r rows and s
columns of A induces a corresponding transformation on A_1 and leaves A_2
and the zero matrices bordering it above unaltered. Clearly the sequence
of such transformations induced by the transformations carrying A_1 to B_1
will replace A by the matrix

       A_0 = \begin{pmatrix} B_1 & 0 \\ 0 & A_2 \end{pmatrix} .

We similarly follow this sequence of elementary transformations by elementary
transformations on the last m - r rows and n - s columns of A_0
which carry A_2 to B_2 and obtain B.

It is important also to observe that we may arbitrarily permute the rows
of A by a sequence of elementary row transformations of type 1, and similarly
we may permute its columns. For any permutation results from some
properly chosen sequence of interchanges.

Before continuing further with the study of rational equivalence we shall 
introduce the familiar properties of determinants in the language of matrix 
theory and shall also define some important special types of matrices. We 
shall then discuss another result used for the types of proofs mentioned 
above. 

5. Determinants. Let B be the square matrix

(14)    B = \begin{pmatrix}
                b_{11} & \cdots & b_{1t} \\
                \vdots &        & \vdots \\
                b_{t1} & \cdots & b_{tt}
            \end{pmatrix} .

The corresponding symbol

(15)    D = \begin{vmatrix}
                b_{11} & \cdots & b_{1t} \\
                \vdots &        & \vdots \\
                b_{t1} & \cdots & b_{tt}
            \end{vmatrix}

is called a t-rowed determinant or determinant of order t. It is defined as the
sum of the t! terms of the form

(16)    (-1)^i b_{1 i_1} b_{2 i_2} . . . b_{t i_t} ,

where the sequence of subscripts i_1, . . . , i_t ranges over all permutations
of 1, 2, . . . , t and the permutation i_1, . . . , i_t may be carried into 1, 2, . . . ,
t by i interchanges. That the sign (-1)^i is unique is proved in L. E. Dickson's
First Course in the Theory of Equations, and we shall assume this result
as well as all the consequent properties of determinants derived there.
The determinant D will be spoken of here as the determinant of the matrix
B and we shall indicate this by writing

(17)    D = |B|

(read D equals determinant B). Nonsquare matrices A do not have determinants,
but their square submatrices have determinants called the
minors of A. If A is a square matrix of n > t rows, the complementary submatrix
of any t-rowed square submatrix B is an (n - t)-rowed square matrix
whose determinant and that of B are minors of A called complementary
minors. In particular, every element a_{ij} of a matrix A defines a one-rowed
square submatrix of A whose determinant is the element itself. Thus we
have seen that the elements of a matrix may be regarded either as its one-rowed
square submatrices or as its one-rowed minors. We now pass to a statement
of some of the most important results on determinants.
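
The definition of a determinant as a sum of t! signed products can be carried
out literally for small t. The Python sketch below is an illustration only; it
numbers rows and columns from 0 instead of 1, finds the sign of each permutation
by counting the interchanges which restore the natural order, and uses function
names of our own choosing.

```python
from itertools import permutations

def sign(perm):
    """Return (-1)^i, where i interchanges carry perm into 0, 1, ..., t - 1."""
    p = list(perm)
    interchanges = 0
    for k in range(len(p)):
        while p[k] != k:
            j = p[k]
            p[k], p[j] = p[j], p[k]   # puts the value j into position j
            interchanges += 1
    return -1 if interchanges % 2 else 1

def det(B):
    """Sum of the t! terms (16) for a t-rowed square matrix B."""
    total = 0
    for perm in permutations(range(len(B))):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= B[row][col]
        total += term
    return total

two_rowed = det([[1, 2], [3, 4]])                           # 1*4 - 2*3 = -2
three_rowed = det([[1, -2, 1], [-1, 3, -2], [2, -4, 3]])    # 1
```

The number of terms grows as t!, so this direct method is practical only for
small orders.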

The result on the interchange of rows and columns of determinants men- 
tioned in Section 3 may now be stated as 

LEMMA 2. Let A be a square matrix. Then

(18)    |A'| = |A| .

The next three properties of determinants are those frequently used in
the computation of determinants, and we shall state them now in the language
we have just introduced.

LEMMA 3. Let B be the matrix obtained from a square matrix A by an elementary
transformation of type 1. Then |B| = -|A|.

LEMMA 4. Let B be the matrix obtained from a square matrix A by an elementary
transformation of type 2. Then |B| = |A|.

LEMMA 5. Let B be the matrix obtained from a square matrix A by an elementary
transformation of type 3 defined for a scalar a. Then |B| = a|A|.

The reader will recall that Lemma 3 may be used in a simple fashion
to obtain

LEMMA 6. If a square matrix has two equal rows or columns, its determinant
is zero.

Another result of this type is

LEMMA 7. If a square matrix has a zero row or column, its determinant is
zero.

Finally, we have 

LEMMA 8. Let A, B, C be n-rowed square matrices such that the ith row 
(column) of C is the sum of the ith row (column) of A and that of B while all 
other rows (columns) of B and C are the same as the corresponding rows (col- 
umns) of A. Then 

(19) |C| = |A| + |B|. 
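
The lemmas above may be checked on a numerical example. The Python sketch below
is an illustration only: the helper det sums the signed products over all
permutations, taking the sign from the parity of the number of inversions (which
agrees with the parity of the number of interchanges), and rows are numbered
from 0.

```python
from itertools import permutations

def det(A):
    """Determinant as a sum of signed products over all permutations."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

A = [[1, -2, 1], [-1, 3, -2], [2, -4, 3]]
transposed = [list(col) for col in zip(*A)]                   # Lemma 2
swapped = [A[1], A[0], A[2]]                                  # type 1, Lemma 3
added = [[x + 5 * y for x, y in zip(A[0], A[2])], A[1], A[2]] # type 2, Lemma 4
scaled = [A[0], [7 * x for x in A[1]], A[2]]                  # type 3, Lemma 5
repeated = [A[0], A[0], A[2]]                                 # Lemma 6

# Lemma 8: B and C agree with A except in the first row, and the first
# row of C is the sum of the first rows of A and B.
B = [[4, 0, -1], A[1], A[2]]
C = [[x + y for x, y in zip(A[0], B[0])], A[1], A[2]]
lemma8_holds = det(C) == det(A) + det(B)
```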

There are, of course, many other properties of determinants, and of these 
we shall use only very few. Those we shall use are, of course, also well 
known to the reader. Of particular importance is that result which might 
be used to define determinants by an induction on order and which does 
yield the actual process ordinarily used in the expansion of a determinant. 
We let A be an n-rowed square matrix A = (a_{ij}) and define d_{ij} to be the
complementary minor of a_{ij}. Then the result we refer to states that if we
define c_{ji} = (-1)^{i+j} d_{ij} then

(20)    |A| = \sum_{k=1}^{n} a_{ik} c_{ki} = \sum_{k=1}^{n} c_{jk} a_{kj}    (i, j = 1, . . . , n) .

Thus, the determinant of A is obtainable as the sum of the products of the
elements a_{ij} in any row (column) of A by their cofactors c_{ji}, that is, the
properly signed and labeled minors (-1)^{i+j} d_{ij}.

The result (20) is of fundamental importance in our theory of matrices
and will be applied presently together with the following parallel result.
Let B be the matrix obtained from a square matrix A by replacing the ith
row of A by its qth row. Then B has two equal rows and by Lemma 6
|B| = 0. We expand |B| as above according to the elements of its ith row
and obtain as its vanishing determinant the sum of the products of all elements
in the qth row of A by the cofactors of the elements in the ith row
of A. Combining this result with the corresponding property about columns
we have

(21)    \sum_{k=1}^{n} a_{ik} c_{kq} = \sum_{k=1}^{n} c_{sk} a_{kj} = 0
        (i ≠ q, s ≠ j; i, j, q, s = 1, . . . , n) .

The equations (20) and (21) exhibit certain relations between the arbitrary
square matrix A and the matrix we define as

(22)    adj A = (c_{ij})    (i, j = 1, . . . , n) .

These relations will have important later consequences. We call the matrix
(22) the adjoint of A and see that if A = (a_{ij}) is an n-rowed square matrix
its adjoint is the n-rowed square matrix with the cofactor of the element
which appears in the jth row and ith column of A as the element in its
own ith row and jth column. Clearly, if A = 0 then adj A = 0.
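
The relations (20) and (21) say that the sums of products of rows of A by
columns of adj A, and of rows of adj A by columns of A, equal |A| when the two
indices agree and zero otherwise. The Python sketch below checks this on one
example. It is an illustration only, assumes n is at least 2, numbers rows and
columns from 0, and uses hypothetical function names.

```python
def minor(A, i, j):
    """The submatrix of A left after deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Expansion according to the elements of the first row, as in (20)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjoint(A):
    """Entry (i, j) is the cofactor of the element in row j, column i of A."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def row_by_column_sums(A, C):
    """The n by n array whose (i, q) entry is a_i1 c_1q + ... + a_in c_nq."""
    n = len(A)
    return [[sum(A[i][k] * C[k][q] for k in range(n)) for q in range(n)]
            for i in range(n)]

A = [[1, -2, 1], [-1, 3, -2], [2, -4, 3]]
adjA = adjoint(A)                        # [[1, 2, 1], [-1, 1, 1], [-2, 0, 1]]
first_sums = row_by_column_sums(A, adjA)
second_sums = row_by_column_sums(adjA, A)
```

Since |A| = 1 for this matrix, both arrays of sums have 1 in each diagonal
position and 0 elsewhere.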

EXERCISES 
1. Compute the adjoint of each of the matrices 




2. Expand the determinants below and verify the following instances of Lem- 
ma 8: 



a) 



32-1 
1 2 
0-1 3 

1 -1 4 
1 -1 1 
234 



3 -3 -1 
1 -1 
003 

-1 1 -3 
1 -1 1 
234 



3 -1 -1 
1 1 
0-1 3 

1 

1 -1 1 
234 



6. Special matrices. There are certain square matrices which have spe- 
cial forms but which occur so frequently in the theory of matrices that they 
have been given special names. The most general of these is the triangular 
matrix, that is, a square matrix having the property that either all its
elements to the right or all to the left of its diagonal are zero. Thus a square
matrix $A = (a_{ij})$ is triangular if it is true that either $a_{ij} = 0$ for all j > i or
that $a_{ij} = 0$ for all j < i. It is clear that A is triangular if and only if A' is
triangular; and, moreover, we have

Theorem 1. The determinant of a triangular matrix is the product
$a_{11} a_{22} \cdots a_{nn}$ of its diagonal elements.

The result above is clearly true if n = 1 so that $A = (a_{11})$, $|A| = a_{11}$.
We assume it true for square matrices of order n - 1 and complete our
induction by expanding |A| according to the elements of its first row or
first column in the respective cases above.

A matrix $A = (a_{ij})$ is called a diagonal matrix if it is a square matrix,
and $a_{ij} = 0$ for all $i \neq j$. Clearly, a diagonal matrix A is triangular so
that its determinant is the product of its diagonal elements.

If all the diagonal elements of a diagonal matrix are equal, we call the
matrix a scalar matrix and have $|A| = a_{11}^{n}$. The scalar matrix for which
$a_{11} = 1$ is called the n-rowed identity matrix and will usually be designated
by I. If there be some question as to the order of I or if we are discussing
several identity matrices of different orders, we shall indicate the order by
a subscript and thus shall write either $I_n$ or I as is convenient for the
n-rowed identity matrix.

Any scalar matrix may be indicated by 

(23)  aI,

where $a = a_{11}$ is the common value of the diagonal elements of the matrix.
We shall discuss the implications of this notation later.

It is natural to call any m by n matrix all of whose elements are zeros
a zero matrix. In any discussion of matrices we shall use the notation 0 to
represent not only the number zero but any zero matrix. The reader will
find that this usage will cause neither difficulty nor confusion. 

We shall frequently feel it desirable to consider square matrices of either
of the forms

(24)  $A = \begin{pmatrix} A_1 & A_2 \\ 0 & A_4 \end{pmatrix}$,   $A_0 = \begin{pmatrix} A_1 & 0 \\ A_3 & A_4 \end{pmatrix}$,

where $A_1$ is a square matrix. Then (24) implies that $A_4$ is necessarily square,
and the reader should verify the fact that the Laplace expansion of
determinants implies that

$|A| = |A_0| = |A_1| \cdot |A_4|$.

The property above and that of Theorem 1 are special instances of a
more general situation. We let A be a square matrix and partition it as in
(6) with s = t and the submatrices $A_{ii}$ all square matrices. Then the Laplace
expansion clearly implies that if all the $A_{ij}$ are zero matrices for either
all i > j or all i < j, then $|A| = |A_{11}| \cdots |A_{tt}|$. Evidently Theorem 1
is the case where the $A_{ii}$ are one-rowed square matrices and the result
considered in (24) the case where t = 2.

In connection with the discussion just completed we shall define a
notation which is quite useful. Let A be a square matrix partitioned as in (6)
and suppose that s = t, the $A_{ii}$ are all square matrices, and every $A_{ij} = 0$
for $i \neq j$. Then A is composed of zero matrices and matrices $A_i = A_{ii}$
which are what we may call its diagonal blocks, and we shall indicate this
by writing

(25)  $A = \operatorname{diag}\{A_1, \ldots, A_t\}$.

As above, the determinant of A is the product of the determinants of its
submatrices $A_1, \ldots, A_t$.

In closing we note the following result which we referred to at the close 
of Section 4. 

LEMMA 9. Every nonzero m by n matrix A is rationally equivalent to a
matrix

(26)  $\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$,

where I is an identity matrix.

For by elementary transformations of type 1 we may carry any element
$a_{pq} \neq 0$ of A into the element $b_{11}$ of a matrix B which is rationally equivalent
to A. By an elementary transformation of type 3 defined for $a = b_{11}^{-1}$ we
replace B by a rationally equivalent matrix C with $c_{11} = 1$. We then apply
elementary row transformations of type 2 with $c = -c_{r1}$ to replace C by
the rationally equivalent matrix $D = (d_{ij})$ such that $d_{11} = 1$, $d_{r1} = 0$ for
r > 1, and then use elementary column transformations of type 2 with
$c = -d_{1r}$ to replace D by the matrix

(27)  $A_0 = \begin{pmatrix} I_1 & 0 \\ 0 & A_1 \end{pmatrix}$.

Now $A_1$ is an m - 1 by n - 1 matrix, and $I_1$ is the identity matrix of one
row. Clearly, A and $A_0$ are rationally equivalent. Moreover, either $A_1 = 0$
and we have (26) for $I = I_1$, or our proof shows that $A_1$ is rationally
equivalent to a matrix

(28)  $\begin{pmatrix} I_1 & 0 \\ 0 & A_2 \end{pmatrix}$.

But then, by Lemma 1, A is rationally equivalent to a matrix

(29)  $\begin{pmatrix} I_2 & 0 \\ 0 & A_2 \end{pmatrix}$.

After finitely many such steps we obtain (26).

We shall show later that the number of rows in the matrix I of (26) is 
uniquely determined by A. 
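
The proof of Lemma 9 is constructive, and its steps may be followed mechanically. The following is a minimal sketch, not a definitive implementation: it assumes rational entries (Python's Fraction type), list-of-rows storage, and 0-based indices, and the function name canonical_form is ours.

```python
from fractions import Fraction


def canonical_form(rows):
    """Carry a matrix into the form (26); return the result and the size of I."""
    a = [[Fraction(x) for x in row] for row in rows]
    m, n = len(a), len(a[0])
    r = 0
    while r < min(m, n):
        # Look for a nonzero element in the part not yet reduced.
        pivot = next(((p, q) for p in range(r, m) for q in range(r, n)
                      if a[p][q] != 0), None)
        if pivot is None:
            break
        p, q = pivot
        a[r], a[p] = a[p], a[r]                  # type 1 on rows
        for row in a:                            # type 1 on columns
            row[r], row[q] = row[q], row[r]
        inv = 1 / a[r][r]
        a[r] = [inv * x for x in a[r]]           # type 3: make the pivot 1
        for i in range(m):                       # type 2 on rows
            if i != r and a[i][r] != 0:
                c = -a[i][r]
                a[i] = [x + c * y for x, y in zip(a[i], a[r])]
        for j in range(n):                       # type 2 on columns
            if j != r and a[r][j] != 0:
                c = -a[r][j]
                for row in a:
                    row[j] += c * row[r]
        r += 1
    return a, r


# Illustrative matrix chosen by us: its third row is the second minus the first.
form, size = canonical_form([[1, 1, 2], [2, 3, 5], [1, 2, 3]])
```

Here form is the matrix (26) with a two-rowed identity block, and size is 2.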

EXERCISES 

1. Carry the following matrices into rationally equivalent matrices of the 
form (26) 

/ 3 5 10 4\ /I 1 2\ 

a) I 1 211 6) (2 3 5] 

\-l -1 - 2 O/ \1 2 3/ 

(1 1 2\ 
2023 
0000 
i o i iy 

2. Apply elementary row transformations only and carry each of the following
matrices into a matrix of the form (26) 

10 10^ 
22-50 
-21-52 
00 ly 

3. Apply elementary column transformations only and carry each of the follow- 
ing matrices into a matrix of the form (26). 

/ 3 2 - x \ /! 1 

a) 2 3-2 6) , 

\5 4 -27 \ J 







. / 1 3 -1\ /O 1 2\ 

c) V-l 2 4 5/ d) VO 5 1 3J 



4. Show that if the determinant of a square matrix A is not zero then A can be
carried into the identity matrix by elementary row* transformations alone. Hint:
The property $|A| \neq 0$ is preserved by elementary transformations. Some element
in the first column of A must not be zero, and by row transformations we may
carry A into a matrix with ones on the diagonal and zeros below and then into I.

* Not every matrix may be carried into the form (26) by row transformations only,
e.g., take A = (11).

7. Rational equivalence of rectangular matrices. The largest order of any
nonvanishing minor of a rectangular matrix A is called the rank of A. The
result of (20) states that every (t + 1)-rowed minor of A is a sum of
numerical multiples of its t-rowed minors, and if the latter are all zero so are
the former. We thus clearly have

LEMMA 10. Let all (r + 1)-rowed minors of A vanish. Then the rank of A
is at most r.

We may also state this result as

LEMMA 11. Let A have a nonzero r-rowed minor and let all (r + 1)-rowed
minors of A vanish. Then r is the rank of A.

Note that we are assigning zero as the rank of the zero matrix and that 
the rank of any nonzero matrix is at least 1. 

The problem of computing the rank of a matrix A would seem from our 
definition and lemmas to involve as a minimum requirement the computation
of at least one r-rowed minor of A and all (r + 1)-rowed minors. The
number of determinants to be computed would then normally be rather 
large, and the computations themselves generally quite complicated. How- 
ever, the problem may be tremendously simplified by the application of 
elementary transformations. We are thus led to study the effect of such 
transformations on the rank of a matrix. 

Let then $A_0$ result from the application of an elementary row
transformation of either type 1 or type 3 to A. By Lemmas 3 and 5 every t-rowed
minor of $A_0$ is the product by a nonzero scalar of a uniquely corresponding
t-rowed minor of A, and it follows that A and $A_0$ have the same rank. If
$A_0$ results when we add to the ith row of A the product by $c \neq 0$ of its qth
row and B is a t-rowed square submatrix of A, the correspondingly placed
submatrix $B_0$ of $A_0$ is equal to B if no row of B is a part of the ith row of A.
If, however, a row of B is in the ith row of A and a row of B is in the qth
row of A, then by Lemma 2.4 we have $|B_0| = |B|$. If, finally, a row of B
is in the ith row of A but no row is in the qth row of A, then by Lemma 8
$|B_0| = |B| + c|C|$, where C is a t-rowed square matrix all but one of
whose rows coincide with those of B, and this remaining row is obtained by
replacing the elements of B in the ith row of A by the correspondingly
columned elements in its qth row. But then it is easy to see that |C| is
a minor of A as well as of $A_0$. If A has rank r we put t = r + 1 and see
that |B| = |C| = 0, $|B_0| = 0$ for every (r + 1)-rowed minor $|B_0|$ of $A_0$.
Also there exists an r-rowed minor $|B| \neq 0$ in A, and our proof shows the
existence of a corresponding minor $|B_0| = |B|$ or $|B_0| = |B| + c|C|$ in
$A_0$. But then $|B_0| = 0$ implies that $|C| = -c^{-1}|B| \neq 0$, and $A_0$ has a
nonzero r-rowed minor |C|, A and $A_0$ have the same rank.

We observe, finally, that if an elementary row transformation be
applied to the transpose A' of A to obtain $A_1$ and the corresponding column
transformation be applied to A to obtain $A_0$, then $A_0' = A_1$. By the above
proof A' and $A_1$ have the same rank, by Lemma 2 A and A' have the same
minors and hence the same rank, so that $A_1$ and $A_1' = A_0$ have the same
rank. Hence, A and $A_0$ have the same rank. Thus, any two rationally
equivalent matrices have the same rank.
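
The invariance of rank under elementary transformations is what makes elimination a practical way to compute it. The sketch below is an illustration under our own assumptions (rational entries, list-of-rows storage, 0-based indices); the names det, rank_by_minors, and rank_by_elimination are ours. The first function applies the definition of rank directly, the second uses elementary row transformations only, and the two agree on the examples.

```python
from fractions import Fraction
from itertools import combinations


def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k]
               * det([row[:k] + row[k + 1:] for row in a[1:]])
               for k in range(len(a)))


def rank_by_minors(a):
    """Largest order of a nonvanishing minor (the definition of rank)."""
    m, n = len(a), len(a[0])
    for t in range(min(m, n), 0, -1):
        for rows in combinations(range(m), t):
            for cols in combinations(range(n), t):
                if det([[a[i][j] for j in cols] for i in rows]) != 0:
                    return t
    return 0


def rank_by_elimination(a):
    """Count the nonzero rows left after row transformations of types 1 and 2."""
    a = [[Fraction(x) for x in row] for row in a]
    m, n = len(a), len(a[0])
    r = 0
    for col in range(n):
        p = next((i for i in range(r, m) if a[i][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, m):
            c = -a[i][col] / a[r][col]
            a[i] = [x + c * y for x, y in zip(a[i], a[r])]
        r += 1
    return r


examples = [
    [[1, -2, 1], [2, -4, 2], [3, -6, 3]],   # all rows proportional
    [[1, 1, 2], [2, 3, 5], [1, 2, 3]],      # third row = second - first
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]
ranks = [(rank_by_minors(a), rank_by_elimination(a)) for a in examples]
```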

Conversely, let the rank r of two m by n matrices A and B be the same
and use Lemma 9 to carry A into a rationally equivalent matrix (26). It
is clear that the rank of the matrix in (26) is the number of rows in the
matrix I. By the above proof A and this matrix have the same rank, $I = I_r$,
A is rationally equivalent to

(30)  $\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$.

Similarly, B is rationally equivalent to (30) and to A. We have thus proved
what we regard as the principal result of this chapter.

Theorem 2. Two m by n matrices are rationally equivalent if and only if
they have the same rank.

We have also the consequent

COROLLARY. Every m by n matrix of rank r is rationally equivalent to an
m by n matrix (30).

A matrix is called nonsingular if it is a square matrix and its determinant
is not zero. But then Theorem 2 implies, as in Ex. 4 of Section 6,

Theorem 3. Every n-rowed nonsingular matrix is rationally equivalent to
the n-rowed identity matrix.

In closing let us observe a result of the application to a matrix of either 
row transformations only or column transformations only. We shall prove 

Theorem 4. Every m by n matrix of rank r > 0 may be carried into a
matrix of the forms

(31)  $\begin{pmatrix} G \\ 0 \end{pmatrix}$,   $(H \;\; 0)$,

respectively, by a sequence of elementary row or column transformations only,
where G is an r-rowed matrix, H is an r-columned matrix, and both G and H
have rank r.

For A is equivalent to (30) by a sequence of elementary row and column
transformations. Clearly, we may obtain (30) by first applying all the row
transformations and then all the column transformations. If we then apply
the inverses of the column transformations in reverse order to (30), we
obtain the result of the application of the row transformations alone to A.
But column transformations applied to (30) clearly carry this matrix into
a matrix of the form given by the first matrix of (31). Moreover, it is
evident that the rank of this matrix is that of G, so that G has rank r. The
result for column transformations is obtained similarly.

EXERCISES 

1. Compute the rank r of the following matrices by using elementary transforma- 
tions to carry each into a matrix with all but r rows (or columns) of zeros and an 
obvious r-rowed nonzero minor. 

/I 1 3\ /2 - 1 4 

a) ( 1 5 2 6) ( 1 3-2 

\1 5 -I/ \1 -11 14 

/' - 3 4 
c) 4 -12 16 



2. Carry the first of each of the following pairs of matrices into the second by 
elementary transformations. Hint: If necessary carry A and B into the form (30) 
and then apply the inverses of those transformations which carry B into (30) to 
carry (30) into B. 

/2 -1 3\ _ /I 




A = ' 

n -2 i\ /2 - 

6)A = (l -221, J5=(60 3 

\2 -4 I/ \1 1 

/I -2 1\ /I 1 

c) A = ( 2 -4 2 1 , B=ll 

\3 -6 3/ VO 0, 



CHAPTER III 
EQUIVALENCE OF MATRICES AND OF FORMS 

1. Multiplication of matrices. If $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ are variables
related by a system of linear equations

(1)  $x_i = \sum_{j=1}^{n} a_{ij} y_j$   $(i = 1, \ldots, m)$,

this system was said in Section 1.9 to define a linear mapping carrying the
$x_i$ to the $y_j$. We call the m by n matrix $A = (a_{ij})$ the matrix of the
mapping (1).

Suppose now that $z_1, \ldots, z_q$ are variables related to $y_1, \ldots, y_n$ by a
second linear mapping

(2)  $y_j = \sum_{k=1}^{q} b_{jk} z_k$   $(j = 1, \ldots, n)$,

with n by q matrix $B = (b_{jk})$, carrying the $y_j$ to the $z_k$. Then, if we
substitute (2) in (1) we obtain a third linear mapping

(3)  $x_i = \sum_{k=1}^{q} c_{ik} z_k$   $(i = 1, \ldots, m)$,

with m by q matrix $C = (c_{ik})$, and it is easily verified by substitution that

(4)  $c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}$   $(i = 1, \ldots, m;\ k = 1, \ldots, q)$.

The linear mapping (3) is usually called the product of the mappings (1)
and (2), and we shall also write C = AB and call the matrix C the product
of the matrix A by the matrix B.

We have now defined the product AB of an m by n matrix A and an
n by q matrix B to be a certain m by q matrix C. Moreover, we have
defined C so that the element $c_{ik}$ in its ith row and kth column is obtained
as the sum $c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}$ of the products of the
elements $a_{ij}$ in the ith row

(5)  $(a_{i1}, a_{i2}, \ldots, a_{in})$

of A, by the corresponding elements $b_{jk}$ in the kth column

(6)  $\begin{pmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{pmatrix}$

of B. Thus we have stated what we shall speak of as the row by column rule
for multiplying matrices.
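
The row by column rule, and the fact that it describes the substitution of (2) in (1), can be illustrated as follows. This is a minimal sketch assuming list-of-rows matrices; the names matmul and apply_mapping, and the sample matrices, are ours.

```python
def matmul(a, b):
    """Row by column rule (4): c_ik is the sum over j of a_ij * b_jk."""
    if len(a[0]) != len(b):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def apply_mapping(a, y):
    """Evaluate x_i = sum_j a_ij y_j for a list of values y."""
    return [sum(aij * yj for aij, yj in zip(row, y)) for row in a]


A = [[1, 2, 0], [0, 1, 3]]        # 2 by 3, carries x to y
B = [[1, 0], [2, 1], [0, 4]]      # 3 by 2, carries y to z
C = matmul(A, B)                  # 2 by 2, carries x to z

z = [1, -1]
via_substitution = apply_mapping(A, apply_mapping(B, z))
via_product = apply_mapping(C, z)
```

Both via_substitution and via_product give the same values of the x's, which is the content of (3) and (4).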

Observe that if either A or B is a zero matrix the product AB is also. 
Moreover, if A is an m by n matrix, then we have 



(7)  $I_m A = A I_n = A$,

where $I_r$ represents the r-rowed identity matrix defined for every r as in
Section 2.6. Observe also that $I_m$ is the matrix of the linear
transformation $x_i = y_i$ for i = 1, . . . , m, and thus in this case the product of (1)
by (2) is immediately (2); hence (7) is trivially true.

We have not defined and shall not define the product AB of two matrices 
in which the number of columns in A is not the same as the number of rows 
in B. Then, evidently, the fact that AB is defined need not imply that BA 
is defined. But when both are defined, they are generally not equal and 
may not even be matrices of the same size. This latter fact is clearly so if, 
for example, A is m by n, B is n by m, and $m \neq n$.

If A and B are n-rowed square matrices, we shall say that A and B are 
commutative if AB = BA. Note the examples of noncommutative square 
matrices in the exercises below. 

Finally, let us observe the following 

Theorem 1. The transpose of a product of two matrices is the product of 
their transposes in reverse order. 

In symbols we state this result as 



(8)  (AB)' = B'A' .

Here A is an m by n matrix, B is an n by q matrix, (AB)' is a q by m
matrix, which we state is the product of the q by n matrix B' and the n by m
matrix A'. We leave the direct application of the row by column rule to
prove this result as an exercise for the reader.
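
A numerical instance of (8), together with an example in which AB and BA are both defined but have different sizes, may be sketched as follows. The helper names and the sample matrices are ours; this checks one case and does not replace the proof left to the reader.

```python
def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


A = [[1, 2, 0], [0, 1, 3]]        # 2 by 3
B = [[1, 0], [2, 1], [0, 4]]      # 3 by 2

AB = matmul(A, B)                 # 2 by 2
BA = matmul(B, A)                 # 3 by 3, so AB and BA cannot be equal
lhs = transpose(AB)
rhs = matmul(transpose(B), transpose(A))
```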

EXERCISES 

1. Compute (3) and hence the product C = AB for the following linear changes 
of variables (mappings) and afterward compute C by the row by column rule. 

I I Vl == "2*1 ~~~ ^"2 i~ "^"8 

Ixi = 2y l + 32/2 -2/s [2/1= -ISai + 26z 2 + 13* 8 

Us = 2/2 - 92/3 1 2/a = *i 2z 2 - s 3 

2. Compute the following matrix products AB. Compute also BA in the cases 
where the latter product is defined. 

/4 3 2\ /-I -2 -3^ 
a) 3 2 1 1 3 
\2 1 -I/ \~1 -2 4^ 

1 - 

2 "I/" - * ^t ,v | 9 j/9 1 o 

3illi t A f\ i ci; | ^ n* io 



o - 





3. Let the symbol $E_{ij}$ represent the three-rowed square matrix with unity in the
ith row and jth column and zeros elsewhere. Verify by explicit computation that
$E_{ij} E_{jk} = E_{ik}$ and that if $j \neq g$ then $E_{ij} E_{gk} = 0$.

2. The associative law. It is important to know that matrix multiplica- 
tion has the property 

(9)  (AB)C = A(BC)

for every m by n matrix A, n by q matrix B, q by s matrix C. This result
is known as the associative law for matrix multiplication and it may be
shown as a consequence that no matter how we group the factors in forming
a product $A_1 \cdots A_t$ the result is the same. In particular, the powers $A^e$ of
any square matrix are unique. We shall assume these two consequences of
(9) without further proof and refer the reader to treatises on the
foundations of mathematics for general discussions of such questions.

To prove (9) we write $A = (a_{ij})$, $B = (b_{jk})$, $C = (c_{kl})$, where in all
cases i = 1, . . . , m; j = 1, . . . , n; k = 1, . . . , q; l = 1, . . . , s. Then it
is clear that the element in the ith row and lth column of G = (AB)C is

(10)  $g_{il} = \sum_{k=1}^{q} \left( \sum_{j=1}^{n} a_{ij} b_{jk} \right) c_{kl}$,

while that in the same position in H = A(BC) is

(11)  $h_{il} = \sum_{j=1}^{n} a_{ij} \left( \sum_{k=1}^{q} b_{jk} c_{kl} \right)$.

Each of these expressions is a sum of nq terms which are respectively of the
form $(a_{ij} b_{jk}) c_{kl}$ and $a_{ij} (b_{jk} c_{kl})$. But those terms in the respective sums with
the same sets of subscripts are equal, since we have already assumed that
the elements of our matrices satisfy the associative law a(bc) = (ab)c.
Hence, $g_{il} = h_{il}$ for all i and l, G = H, and (9) is proved.

EXERCISES 

Compute the products (AB)C and A(BC) in the following cases. 

/2 -1 3\ 

o) A = 3 12, B = 

\0 -1 I/ 



2 ~3\ / / 2 -1 1 




^-a.! 



c) A(l 2 -1 3), 



3. Products by diagonal and scalar matrices. Let $A = (a_{ij})$ be an m by n
matrix and B be a diagonal matrix, and designate the ith diagonal element
of B by $b_i$. Then our definition of product implies that if B is m-rowed so
that BA is defined, then

(12)  $BA = (b_i a_{ij})$   $(i = 1, \ldots, m;\ j = 1, \ldots, n)$;

while if B is n-rowed, then

(13)  $AB = (a_{ij} b_j)$   $(i = 1, \ldots, m;\ j = 1, \ldots, n)$.

Thus the product of a matrix A by a diagonal matrix B on the left is ob- 
tained as the result of multiplying the rows of A in turn by the correspond- 
ing diagonal elements of B; the product of A by a diagonal matrix on the 
right is the result of multiplying the columns of A in turn by the corre- 
sponding diagonal elements of B. 
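
The two rules (12) and (13) are easily seen in a small case. The sketch below assumes list-of-rows matrices; the names matmul and diag and the sample entries are ours.

```python
def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def diag(values):
    """Diagonal matrix with the given diagonal elements."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6]]                 # 2 by 3
left = matmul(diag([10, -1]), A)           # rows of A multiplied by 10 and -1
right = matmul(A, diag([1, 0, 2]))         # columns of A multiplied by 1, 0, 2
```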

Now, let m = n, A be a square matrix. Then from (12) and (13) we see
that AB = BA if and only if

(14)  $(b_i - b_j) a_{ij} = 0$   $(i, j = 1, \ldots, n)$.

As an immediate consequence of the case $b_1 = \cdots = b_n$ we have the very
simple

Theorem 2. Every n-rowed scalar matrix is commutative with all n-rowed 
square matrices. 

We next see that, if $i \neq j$ and $b_i \neq b_j$, then (14) implies that $a_{ij} = 0$.
This gives the result we shall state as

Theorem 3. Let the diagonal elements of an n-rowed diagonal matrix B be 
all distinct. Then the only n-rowed square matrices commutative with B are 
the n-rowed diagonal matrices. 

We may now prove the converse of Theorem 2, a result which is the
inspiration of the name scalar matrix.

Theorem 4. The only n-rowed square matrices which are commutative with 
every n-rowed square matrix are the scalar matrices. 

For let

(15)  $B = (b_{ij})$   $(i, j = 1, \ldots, n)$

and suppose that B is commutative with every n-rowed square matrix A.
We shall select A in various ways to obtain our theorem. First, we let $E_j$ be
the diagonal matrix with unity in its jth row and column and zeros
elsewhere and put $BE_j = E_jB$. Equations (12) and (13) imply that the jth
row of $E_jB$ is the same as that of B and the jth column of $BE_j$ is the
same as that of B, while all other columns of $BE_j$ are zero. Thus if $i \neq j$,
the elements in the ith column of $E_jB$ must be zero. Since $b_{ji}$ is in the
ith column, we have $b_{ji} = 0$ for $j \neq i$, and B is a diagonal matrix. If $D_j$
is the matrix with 1 in its first row and jth column and zeros elsewhere, the
product $BD_j$ has $b_{11}$ in its first row and jth column and is equal to the
matrix $D_jB$ which has $b_{jj}$ in this same place. Hence, $b_{jj} = b_{11} = b$ for j = 2,
. . . , n, and B is the scalar matrix $bI_n$.

Let us now observe that if A is any m by n matrix and a is any scalar, 
then 

(16)  $(aI_m)A = A(aI_n)$

is an m by n matrix whose element in the ith row and jth column is the 
product by a of the corresponding element of A. This is then a type of 
product 

(17) aA = Aa 

like that defined in Chapter I for sequences, and we shall call such a product 
the scalar product of a by A. However, we have defined (17) as the in- 
stances (16) of our matrix product (4). 

EXERCISES 

1. Compute the products AB and BA by the use of (12) and (13) as well as the 
row by column rule if 

71 0\ 
a) A = -2 , 
\0 37 

71 0\ 

6) A = -1 , B = 

\0 07 

(2 Ov /O -1 0\ 

02 01 _ / 0-1 

00-2 or ** ~ 1 1 o o o 

000 -2/ \0 1 

73 0\ 

d) A - 2 , 
\0 17 

2. Find all three-rowed square matrices B such that BA = AB if

/I 0\ 
o) 4 = -1 b) A

3. Prove by direct multiplication that BA = AB if $i^2 = -1$ and



4. Let $\omega$ be a primitive cube root of unity. Prove that $BA = \omega AB$ if





5. Show that the matrix B of Ex. 4 has the property $B^3 = aI$ and that the
matrix A has the property $A^3 = I$. Obtain similar results for the matrices of Ex. 3.

6. Compute BAB if

$A = \begin{pmatrix} c & -d \\ \bar{d} & \bar{c} \end{pmatrix}$,   $B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,

where c and d are any ordinary complex numbers, $\bar{c}$ and $\bar{d}$ are their conjugates.

4. Elementary transformation matrices. We shall show that the matrix
which is the result of the application of any elementary row (column)
transformation to an m by n matrix A is a corresponding product EA (the
product AE) where E is a uniquely determined square matrix. We might of
course give a formula for each E and verify the statement, but it is simpler
to describe E and to obtain our results by a device which is a consequence
of the following

Theorem 5. Let A be an m by n matrix, B be an n by q matrix so that
C = AB is an m by q matrix. Apply an elementary row (column)
transformation to A (to B) resulting in what we shall designate by $A_0$ (by $B^{(0)}$), and then
the same elementary transformation to C resulting in $C_0$ (in $C^{(0)}$). Then

(18)  $C_0 = A_0 B$,   $C^{(0)} = A B^{(0)}$.

For proof we see that if we replace $a_{ij}$ by $a_{qj}$ in the right members of (4)
we get $c_{qk}$. Thus we obtain the result of an elementary row transformation
of type 1 on AB by applying it to A. Our definitions also imply that for
elementary row transformations of type 3 the result stated as (18) follows
from

(19)  $\sum_{j=1}^{n} (a\,a_{ij}) b_{jk} = a \sum_{j=1}^{n} a_{ij} b_{jk} = a\,c_{ik}$   $(i = 1, \ldots, m;\ k = 1, \ldots, q)$.

Finally, we see that for type 2 they follow from

(20)  $\sum_{j=1}^{n} (a_{ij} + c\,a_{qj}) b_{jk} = \left( \sum_{j=1}^{n} a_{ij} b_{jk} \right) + c \left( \sum_{j=1}^{n} a_{qj} b_{jk} \right) = c_{ik} + c\,c_{qk}$   $(i = 1, \ldots, m;\ k = 1, \ldots, q)$,

and we have proved the first equation in (18). The corresponding column
result $C^{(0)} = AB^{(0)}$ has an obvious parallel proof or may be thought of as
being obtained by the process of transposition. It is surely unnecessary to
supply further details.

We now write A = IA where I is the m-rowed identity matrix and apply
Theorem 5 to obtain $A_0 = I_0 A$, in which $I_0$ is the matrix called E above.
Then we see that to apply any elementary row transformation to A we
simply multiply A on the left by either $E_{ij}$, $P_{ij}(c)$, or $R_i(a)$. Here we define
$E_{ij}$ to be the matrix obtained by interchanging the ith and jth rows of
I, $P_{ij}(c)$ by adding c times the jth row of I to its ith row, $R_i(a)$ by
multiplying the ith row of I by $a \neq 0$. We shall call $E_{ij}$, $P_{ij}(c)$, and $R_i(a)$ elementary
transformation matrices of types 1, 2, and 3, respectively.

Observe that

(21)  $E_{ij}' = E_{ji} = E_{ij}$,   $E_{ij} E_{ij} = I$,

so that, if we now assume that $E_{ij}$ is n-rowed, the product $AE_{ij}$ is the result
of an elementary column transformation of type 1 on A. Similarly,

(22)  $[P_{ij}(c)]' = P_{ji}(c)$,   $P_{ij}(-c) P_{ij}(c) = P_{ij}(c) P_{ij}(-c) = I$,

and if $P_{ij}(c)$ is n-rowed, then $AP_{ij}(c)$ is the result of an elementary column
transformation of type 2 on A. Finally,

(23)  $[R_i(a)]' = R_i(a)$,   $R_i(a^{-1}) R_i(a) = R_i(a) R_i(a^{-1}) = I$,

and if $R_i(a)$ is n-rowed, then $AR_i(a)$ is the result of an elementary column
transformation of type 3 on A. Thus the elementary column transforma- 
tions give rise to exactly the same set of elementary transformation mat- 
rices as were obtained from the row transformations. 
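
The three kinds of elementary transformation matrices, and their action by multiplication on the left and on the right, can be sketched as follows. The code assumes 0-based indices and rational entries; the function names E, P, and R merely mirror the notation of the text and are otherwise ours. Note that with this definition of P, multiplication on the right adds c times column i to column j.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def E(n, i, j):
    """Type 1: interchange rows i and j of the n-rowed identity."""
    m = identity(n)
    m[i], m[j] = m[j], m[i]
    return m


def P(n, i, j, c):
    """Type 2: add c times row j of the identity to its row i (i != j)."""
    m = identity(n)
    m[i][j] += c
    return m


def R(n, i, a):
    """Type 3: multiply row i of the identity by a nonzero scalar a."""
    if a == 0:
        raise ValueError("a must be nonzero")
    m = identity(n)
    m[i][i] = Fraction(a)
    return m


A = [[1, 2, 3], [4, 5, 6]]
swapped_rows = matmul(E(2, 0, 1), A)       # rows of A interchanged
added_row = matmul(P(2, 0, 1, 5), A)       # row 0 plus 5 times row 1
added_col = matmul(A, P(3, 0, 1, 5))       # column 1 plus 5 times column 0
```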

We shall now interpret the results of Section 2.7 in terms of elementary
transformation matrices. First of all, we may interpret Theorem 2.2 as the
following

LEMMA 1. Let A and B be m by n matrices. Then there exist elementary
transformation matrices $P_1, \ldots, P_s$, $Q_1, \ldots, Q_t$ such that

$B = (P_1 \cdots P_s) A (Q_1 \cdots Q_t)$

if and only if A and B have the same rank.

Theorem 2.3 is the case of Lemma 1 where A is the n-rowed identity
matrix, and consequently $B = P_1 \cdots P_s Q_1 \cdots Q_t$. Thus we obtain

LEMMA 2. Every nonsingular matrix is a product of elementary transforma- 
tion matrices. 

We shall close the results with an important consequence of Theorem 2.4. 

Theorem 6. The rank of a product of two matrices does not exceed the rank 
of either factor. 

For, by Theorem 2.4, if the rank of A is r, there exists a sequence of
elementary row transformations carrying A into an m by n matrix $A_0$ whose
bottom m - r rows are all zero. By Theorem 5, if we apply these
transformations to C = AB, we obtain a rationally equivalent matrix $C_0 = A_0 B$. Then
the bottom m - r rows of the m-rowed matrix $C_0$ are all zero, and the
rank of C is the rank of $C_0$ and is at most r, as desired. The corresponding
result on the relation between the ranks of C and B is obtained similarly.
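
Theorem 6 allows strict inequality, as a small example shows. The following sketch assumes rational entries and list-of-rows storage; the names matmul and rank and the two sample matrices are ours. Both factors have rank 2 while their product has rank 1.

```python
from fractions import Fraction


def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def rank(a):
    """Rank computed by elementary row transformations of types 1 and 2."""
    a = [[Fraction(x) for x in row] for row in a]
    m, n = len(a), len(a[0])
    r = 0
    for col in range(n):
        p = next((i for i in range(r, m) if a[i][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, m):
            c = -a[i][col] / a[r][col]
            a[i] = [x + c * y for x, y in zip(a[i], a[r])]
        r += 1
    return r


A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]      # rank 2
B = [[1, 2, 3], [2, 4, 6], [1, 1, 1]]      # rank 2
AB = matmul(A, B)                          # keeps only the first two rows of B
ranks = (rank(A), rank(B), rank(AB))
```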

EXERCISES 
1. Express the following as products of elementary transformation matrices. 



^ 
0) 



2 l 




2. Find the elementary transformation matrices corresponding to the elementary
row transformations used in Ex. 2 of Section 2.6 and carry the matrices of that
exercise into the form (2.30) by matrix multiplication.

5. The determinant of a product. In the theory of determinants it is
shown that the symbol for the product of two determinants may be com- 
puted as the row by column product of the symbols for its factors. In our 
present terminology this result may be stated as 

Theorem 7. The determinant of a product of two square matrices is the 
product of the determinants of the factors, that is, 

(24)  $|AB| = |A| \cdot |B|$.

The usual proof in determinant theory of the result is quite complicated,
and it is interesting to note that it is possible to derive the theorem as a
simple consequence of our theorems which were obtained independently
and which we should have wished to derive even had Theorem 7 been
assumed. We shall, therefore, give such a derivation. We thus let A and B
be n-rowed square matrices and see that Theorem 6 states that if |A| = 0
then |AB| = 0. Hence, let A be nonsingular so that, by Lemma 2,

$A = P_1 \cdots P_s$,

where the $P_i$ are elementary transformation matrices. From our
definitions we see that, if E is an n-rowed elementary transformation matrix of
type 1, 2, or 3, then |E| = -1, 1, or a, respectively. Thus, if G is an
n-rowed square matrix and $G_0 = EG$, then Lemmas 2.3, 2.4, and 2.5 imply
that $|G_0| = -|G|$, $|G|$, $a|G|$, respectively, and hence $|G_0| = |E| \cdot |G|$.
It follows clearly that, if $E_1, \ldots, E_t$ are any elementary transformation
matrices, then $|E_t \cdots E_1 G| = |E_t| \cdots |E_1| \cdot |G|$. We apply this
result first to A to obtain $|A| = |P_1| \cdots |P_s|$ and then to AB to obtain
$|AB| = |P_1 \cdots P_s B| = |P_1| \cdots |P_s| \cdot |B| = |A| \cdot |B|$ as desired.
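
Theorem 7 and the determinants of the elementary transformation matrices used in its proof may be checked in a particular case. This is only an illustrative sketch with integer matrices of our own choosing; the names det and matmul are ours.

```python
def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k]
               * det([row[:k] + row[k + 1:] for row in a[1:]])
               for k in range(len(a)))


def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


A = [[3, 2, -1], [1, 0, 2], [0, -1, 3]]    # |A| = 1
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]      # |B| = 5
lhs = det(matmul(A, B))
rhs = det(A) * det(B)

# Two-rowed elementary transformation matrices of types 1, 2, 3 (with a = 7).
type1 = [[0, 1], [1, 0]]
type2 = [[1, 5], [0, 1]]
type3 = [[1, 0], [0, 7]]
elementary_dets = [det(type1), det(type2), det(type3)]
```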

6. Nonsingular matrices. An n-rowed square matrix A is said to have an
inverse if there exists a matrix B such that AB = BA = I is the n-rowed
identity matrix. Clearly, B is an n-rowed square matrix which we shall
designate by $A^{-1}$ and thus write

(25)  $AA^{-1} = A^{-1}A = I$.

Moreover, we have

Theorem 8. A square matrix A has an inverse if and only if A is
nonsingular.

For if (25) holds, we apply Theorem 7 to obtain $|A| \cdot |A^{-1}| = |I| = 1$,
$|A| \neq 0$. The converse may be shown to follow from (21), (22), (23), and
Lemma 2; but we shall prove instead the result that if $|A| \neq 0$ then $A^{-1}$ is
uniquely determined as the matrix

(26)  $A^{-1} = |A|^{-1} \operatorname{adj} A$.

This formula follows from the matrix equation

(27)  $A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A| \cdot I$



by multiplication by the scalar $|A|^{-1}$, and we observe that (27) is the
interpretation of (2.20) and (2.21) in terms of matrix product, where adj A is
our symbol for the adjoint matrix defined in (2.22).

We now prove $A^{-1}$ unique by showing that if either AB = I or BA = I
then B is necessarily the matrix $A^{-1}$ of (26). This is clear since in either
case $|A| \neq 0$, $A^{-1}$ of (26) exists, $A^{-1} = A^{-1}I = A^{-1}(AB) = (A^{-1}A)B =
IB = B$, and similarly if BA = I.
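
Formula (26) gives a direct, if not the most economical, way to compute an inverse. The sketch below assumes integer or rational entries and list-of-rows storage; the names minor, det, adjoint, inverse, and matmul are ours.

```python
from fractions import Fraction


def minor(a, i, j):
    """Return a with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by expansion along the first row."""
    if not a:          # convention used here for the empty minor of a 1 by 1 matrix
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def adjoint(a):
    """Entry (i, j) is the cofactor of the element in row j, column i of a."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]


def inverse(a):
    """Inverse as the adjoint multiplied by the scalar 1/|A|, as in (26)."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(c, d) for c in row] for row in adjoint(a)]


def matmul(a, b):
    """Row by column product of two list-of-rows matrices."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


A = [[2, 0], [1, 4]]                       # |A| = 8
A_inv = inverse(A)
check_left = matmul(A_inv, A)
check_right = matmul(A, A_inv)
```

Both check_left and check_right are the two-rowed identity matrix, in agreement with (25).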

We note also that, if A and B are nonsingular,

(28)  $(AB)^{-1} = B^{-1}A^{-1}$.

For $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AA^{-1} = I$. Finally, if A is
nonsingular we have

(29)  $(A^{-1})' = (A')^{-1}$.

For $I' = I = (AA^{-1})' = (A^{-1})'A'$, $(A^{-1})'$ is the inverse of A'.

A linear mapping (1) with m = n has an inverse mapping (2) with 
n = q if the product (3) is the identical mapping, that is, if C = AB is the 
identity matrix. But this is then possible if and only if $|A| \neq 0$, that is,
(1) is what we called a nonsingular linear mapping in Section 1.9. We shall 
use this concept later in studying the equivalence of forms. 

EXERCISES 

1. Show that if A is an n-rowed square matrix of rank $n - 1 \neq 0$, then adj A
has rank 1. Hint: By (27) we have A(adj A) = 0, $PAQ(Q^{-1} \operatorname{adj} A) = 0$ for

$PAQ = \begin{pmatrix} I_{n-1} & 0 \\ 0 & 0 \end{pmatrix}$.

Then the first n - 1 rows of $Q^{-1} \operatorname{adj} A$ must be zero, adj A has rank 1.

2. Use the result above and (28) to prove that $\operatorname{adj}(\operatorname{adj} A) = |A|^{n-2} A$ if n > 2
and adj (adj A) = A if n = 2 and $|A| \neq 0$.

3. Use Ex. 1 and (27) to show that $|\operatorname{adj} A| = |A|^{n-1}$.

4. Compute the inverses of the following matrices by the use of (26). 



fb 


bc- 




n /O 2\ 
) 6) (1 


73-2 0\ 
c) 3-2 


a 


c 




7 \0 1 O/ 


\- 


-2 





3/ 


/2 
1 


3 
2 


4^ 
6 


\ / l 1 ! \ 
e) 1 2 3 


4 


-1 
1 



1 
2 


2 
-1 
1 


vo 





ly 


' \1 3 6/ 


\3 


-2 


1 


6 




/ 


2 


-1 1 6\ 


/ 2 -5 


2 


-3^ 




9) 


- 



1 


!?-i >! 


-1 -3 
1 X 


3 
-1 


-1 







\ 


1 


2 3/ 


\-l 1 





ly
5. Let A be the matrix of (d) above and B be the matrix of (e). Compute
$C = (AB)^{-1}$ by (28) and verify that (AB)C = I by direct multiplication.



7. Equivalence of rectangular matrices. Our considerations thus far have 
been devised as relatively simple steps toward a goal which we may now 
attain. We first make the 

DEFINITION. Two m by n matrices A and B are called equivalent if there 
exist nonsingular matrices P and Q such that 

(30) PAQ = B . 

Observe that P and Q are necessarily square matrices of m and n rows, 
respectively. By Lemma 2 both P and Q are products of elementary trans- 
formation matrices and therefore A and B are equivalent if and only if A and 
B are rationally equivalent. The reader should notice that the definition of 
rational equivalence is to be regarded here as simply another form of the 
definition of equivalence given above and, while the previous definition is 
more useful for proofs, that above is the one which has always been given 
in previous expositions of matrix theory. We may now apply Lemma 1 and 
have the principal result of the present chapter. 

Theorem 9. Two m by n matrices are equivalent if and only if they have 
the same rank. 

We emphasize in closing that, if A and B are equivalent, the proof of 
Theorem 2.2 shows that the elements of P and Q in (30) may be taken to 
be rational functions, with rational coefficients, of the elements of A and B. 

EXERCISES 

1. Compute matrices P and Q for each of the matrices A of Ex. 1 of Section 2.6
such that PAQ has the form (2.30). Hint: If A is m by n, we may obtain P by
applying those elementary row transformations used in that exercise to $I_m$ and
similarly for Q. (The details of an instance of this method are given in the
illustrative example at the end of Section 8.)

2. Show that the product AB of any three-rowed square matrices A and B of
rank 2 is not zero. Hint: There exist matrices P and Q such that $A_0 = PAQ$ has the
form (2.30) for r = 2. Then, if AB = 0, we have $A_0B_0 = 0$ where $B_0 = Q^{-1}B$ has
the same rank as B and may be shown to have two rows with elements all zero.

3. Compute the ranks of A, B, AB for the following matrices. Hint: Carry A
into a simpler matrix $A_0 = PA$ by row transformations alone, B into $B_0 = BQ$ by
column transformations alone, and thus compute the rank of $A_0B_0 = P(AB)Q$
instead of that of AB.



a) A = 2 3 4 , B 



B 



(3-52 4^ 
1-213 
1-1 0-2 
5-83 5/ 

8. Bilinear forms. As we indicated at the close of Chapter I, the problem 
of determining the conditions for the equivalence of two forms of the same 
restricted type is customarily modified by the imposition of corresponding 
restrictions on the linear mappings which are allowed. We now precede 
the introduction of those restrictions which are made for the case of bi- 
linear forms by the presentation of certain notations which will simplify our 
discussion. 

The one-rowed matrices 

(31) $x' = (x_1, \ldots, x_m)$ , $y' = (y_1, \ldots, y_n)$

have one-columned matrices x and y as their respective transposes. We
let A be the m by n matrix of the system of equations (1) and see that this
system may be expressed as either of the matrix equations

(32) x = Ay , x' = y'A' .

We have called (1) a nonsingular linear mapping if m = n and A is nonsingular.
But then the solution of (1) for $y_1, \ldots, y_n$ as linear forms in
$x_1, \ldots, x_n$ is the solution

(33) $y = A^{-1}x$ , $y' = x'(A')^{-1} = x'(A^{-1})'$

of (32) for y in terms of x (or y' in terms of x'). We shall again consider
m by n matrices A and variables $x_i$ and $y_j$ and shall now introduce the
notation

(34) x = P'u

for a nonsingular linear mapping carrying the $x_i$ to new variables $u_k$, for
i, k = 1, . . . , m, so that the transpose P of P' is a nonsingular m-rowed
square matrix and $u' = (u_1, \ldots, u_m)$. Similarly, we write

(35) y = Qv

for a nonsingular n-rowed square matrix Q and $v' = (v_1, \ldots, v_n)$. We now
return to the study of bilinear forms.

A bilinear form $f = \sum x_i a_{ij} y_j$, for i = 1, . . . , m and j = 1, . . . , n, is a
scalar which may be regarded as a one-rowed square matrix and is then the
matrix product

(36) f = x'Ay .

Here x and y are given by (31), and we call the m by n matrix A the matrix
of f, its rank the rank of f. Also let g = x'By be a bilinear form in $x_1, \ldots, x_m$
and $y_1, \ldots, y_n$ with m by n matrix B. Then we shall say that f and g are
equivalent if there exist nonsingular linear mappings (34) and (35) such
that the matrix of the form in $u_1, \ldots, u_m$ and $v_1, \ldots, v_n$ into which f is
carried by these mappings is B. But if (34) holds, then x' = u'P and

f = (u'P)A(Qv) = u'(PAQ)v ,

so that B = PAQ and A are equivalent. Thus, two bilinear forms f and g
are equivalent if and only if their matrices are equivalent. By Theorem 9 we
see that two bilinear forms are equivalent if and only if they have the same
rank. It follows also that every bilinear form of rank r is equivalent to
the form

(37) $x_1y_1 + \cdots + x_ry_r$ .

These results complete the study of the equivalence of bilinear forms. 

ILLUSTRATIVE EXAMPLE 
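We shall find nonsingular linear mappings (34) and (35) which carry the form

$f = 2x_1y_1 - 3x_1y_2 + x_1y_3 - x_2y_1 + 5x_2y_3 - 6x_3y_1 + 3x_3y_2 + 19x_3y_3$

into a form of the type (37). The matrix of f is

$A = \begin{pmatrix} 2 & -3 & 1 \\ -1 & 0 & 5 \\ -6 & 3 & 19 \end{pmatrix}$ .

We interchange the first and second rows of A, add twice the new first row to the
new second row, add $-6$ times the new first row to the third row, and obtain

$\begin{pmatrix} -1 & 0 & 5 \\ 0 & -3 & 11 \\ 0 & 3 & -11 \end{pmatrix}$ ,

which evidently has rank 2. We then add the second row to the third row, multiply
the first row by $-1$, the second row by $-\tfrac{1}{3}$, and obtain

$PA = \begin{pmatrix} 1 & 0 & -5 \\ 0 & 1 & -\tfrac{11}{3} \\ 0 & 0 & 0 \end{pmatrix}$ .

The matrix P is obtained by performing the transformations above on the
three-rowed identity matrix, and hence

$P = \begin{pmatrix} 0 & -1 & 0 \\ -\tfrac{1}{3} & -\tfrac{2}{3} & 0 \\ 1 & -4 & 1 \end{pmatrix}$ .

We continue and carry PA into PAQ of the form (2.30) for r = 2 by adding five
times the first column and $\tfrac{11}{3}$ times the second column of PA to its third column.
Then

$Q = \begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & \tfrac{11}{3} \\ 0 & 0 & 1 \end{pmatrix}$ .

The corresponding linear mappings (34) and (35) are given respectively by

$x_1 = -\tfrac{1}{3}u_2 + u_3$ ,  $y_1 = v_1 + 5v_3$ ,
$x_2 = -u_1 - \tfrac{2}{3}u_2 - 4u_3$ ,  $y_2 = v_2 + \tfrac{11}{3}v_3$ ,
$x_3 = u_3$ ,  $y_3 = v_3$ .

We verify by direct substitution that

$f = (-\tfrac{1}{3}u_2 + u_3)(2v_1 + 10v_3 - 3v_2 - 11v_3 + v_3) + (-u_1 - \tfrac{2}{3}u_2 - 4u_3)(-v_1 - 5v_3 + 5v_3)$
$\quad + u_3(-6v_1 - 30v_3 + 3v_2 + 11v_3 + 19v_3) = u_1v_1 + u_2v_2$

as desired.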
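
The computation of this example can be checked mechanically. The sketch below assumes that the matrix of f is the one recovered from P and PA, namely the rows (2, -3, 1), (-1, 0, 5), (-6, 3, 19); the helper `matmul` is an illustrative name and not part of the text.

```python
from fractions import Fraction as Fr


def matmul(x, y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


# Matrix of the form f (assumed as stated in the lead-in), with P and Q of the example.
A = [[2, -3, 1], [-1, 0, 5], [-6, 3, 19]]
P = [[0, -1, 0], [Fr(-1, 3), Fr(-2, 3), 0], [1, -4, 1]]
Q = [[1, 0, 5], [0, 1, Fr(11, 3)], [0, 0, 1]]

PA = matmul(P, A)
PAQ = matmul(PA, Q)
print(PAQ)  # the matrix of the form into which f is carried
```

If the product is diag {1, 1, 0}, the mappings x = P'u and y = Qv carry f into a form of the type (37) with r = 2.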

EXERCISES 

1. Use the method above to find nonsingular linear mappings which carry the
following bilinear forms into forms of the type (37).
a) $2x_1y_1 + 3x_1y_2 - x_2y_1 - 2x_2y_2$
c) $2x_1y_1 + 3x_1y_2 - x_1y_3 + 5x_2y_1 + 2x_2y_2 + x_2y_3 + 3x_3y_1 -$

d) $2x_1y_1 - x_1y_2 + 3x_1y_3 + x_1y_4 - 2x_2y_1$

$- x_3y_3 + 2x_3y_4 + 4x_4y_1 - x_4y_2 + x_4y_3 + 5x_4y_4$

2. Use the method above to find a nonsingular linear mapping on $x_1, x_2, x_3$ such
that it and the identity mapping on $y_1, y_2, y_3$ carry the following forms into forms
of the type (37).

a) 



c) Xii/i - 4xi2/ 2 + Zi2/ 8 - x 2 2/i + x 2 y 2 

d) 3 

e) - 



2x42/2 - 

3. Find a nonsingular linear mapping on $y_1, y_2, y_3$ such that it and the identical
mapping on $x_1, x_2, x_3$ carry the forms of Ex. 2 into forms of the type (37). Hint: The
matrices A of the forms of Ex. 2 are nonsingular so that the corresponding matrices
P have the property PA = I. Then $P = A^{-1}$, AQ = I has the unique solution
Q = P.

4. Use elementary row transformations as above to compute the inverses of the 
matrices of Section 6, Ex. 4. 

5. Use elementary column transformations to compute the inverses of the fol- 
lowing matrices. 





10 -4 1 -1\ 
-1301 
2-3 O 
1 3 -1 I 

9. Congruence of square matrices. There are some important problems 
regarding special types of bilinear forms as well as the theory of equivalence 
of quadratic forms which arise when we restrict the linear mappings we use. 
We let m = n so that the matrices A of forms f = x'Ay are square matrices.
Then (34) and (35) are called cogredient mappings if Q = P'. Then f and g
are clearly equivalent under cogredient mappings if and only if

(38) B = PAP' . 

We shall call A and B congruent if there exists a nonsingular matrix P
satisfying (38).

We shall not study the complicated question as to the conditions that
two arbitrary square matrices be congruent but shall restrict our attention
to two special cases.

Before passing to this study we observe that Lemma 2 and Section 7
imply that A and B are congruent if and only if A may be carried into B by
a sequence of operations each of which consists of an elementary row
transformation followed by the corresponding column transformation. Thus
$P = P_s \cdots P_1$, where the $P_i$ are elementary transformation matrices,
$B = P_s \cdots P_1 A P_1' \cdots P_s' = P_s \cdots P_2 (P_1 A P_1') P_2' \cdots P_s'$,
and so forth as desired. We shall speak of such operations on a matrix A as
cogredient elementary transformations of the three types and shall use them in
our study of the congruence of matrices.
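
A single cogredient elementary transformation may be illustrated as follows. This is only a sketch: the helper names and the sample matrix are assumptions made for the illustration. It forms an elementary transformation matrix E of type 2 and computes EAE', which is the result of a row transformation followed by the corresponding column transformation.

```python
from fractions import Fraction


def matmul(x, y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(m):
    return [list(col) for col in zip(*m)]


def add_row_multiple(n, target, source, c):
    """Elementary matrix of type 2: adds c times row `source` to row `target`."""
    e = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    e[target][source] = Fraction(c)
    return e


A = [[0, 1, 2], [1, 0, 3], [2, 3, 4]]   # a symmetric matrix
E = add_row_multiple(3, 0, 1, 1)        # add the second row to the first row
B = matmul(matmul(E, A), transpose(E))  # row step, then the matching column step
print(B)
```

Here B is congruent to A, remains symmetric, and its first diagonal element is twice the element in the first row and second column of A.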

10. Skew matrices and skew bilinear forms. A square matrix A is called
symmetric if A' = A, and skew if A' = -A. If B = PAP' is congruent to
A, then B' = PA'P', so that B is symmetric if and only if A is symmetric, and B is
skew if and only if A is skew.

If $A = (a_{ij})$ is a skew matrix, then $a_{ji} = -a_{ij}$ for all i and j. Hence
$a_{ii} = -a_{ii}$ and consequently every diagonal element of A is zero. We use
this result in the proof of

Theorem 10. Two n-rowed skew matrices are congruent if and only if they have
the same rank r. Moreover r is an even integer 2t, and every skew matrix is
thus congruent to a matrix

(39) $\begin{pmatrix} 0 & -I_t & 0 \\ I_t & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ .

For either A = 0 = PAP' for every P and our result is trivial, or some
$a_{ij} \neq 0$. We may interchange the ith row and first row, the jth and second
row and thus also the corresponding columns by cogredient elementary
transformations of type 1. We thus obtain a skew matrix $H = (h_{ij})$
congruent to A and with $a_{ij} = h_{12} \neq 0$, $h_{21} = -h_{12}$, $h_{11} = h_{22} = 0$. Multiply
the first row and column of H by $-h_{12}^{-1}$ and obtain a skew matrix $C = (c_{ij})$
congruent to A and with $c_{12} = -1$, $c_{21} = 1$, $c_{11} = c_{22} = 0$. We now apply
a sequence of cogredient elementary transformations of type 2 where we
add $-c_{j1}$ times the second row of C to its jth row as well as $c_{j2}$ times the
first row of C to its jth row for j = 3, . . . , n, and thus obtain the skew
matrix

(40) $A_0 = \begin{pmatrix} E & 0 \\ 0 & A_1 \end{pmatrix}$ , $E = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ .

The matrix $A_0$ is congruent to A, $A_1$ is necessarily a skew matrix of n - 2
rows, and any cogredient elementary transformation on the last n - 2 rows
and columns of $A_0$ induces corresponding transformations on $A_1$, carrying
it to a congruent skew matrix $H_1$ and carrying $A_0$ to a congruent skew
matrix

$\begin{pmatrix} E & 0 \\ 0 & H_1 \end{pmatrix}$ .

It follows that, after a finite number of such steps, we may replace A by
a congruent skew matrix

(41) $G = \operatorname{diag}\{E_1, \ldots, E_t, 0, \ldots, 0\}$

with each $E_k = E$. Clearly, G and A have rank 2t. If B also is a skew
matrix of rank 2t, then B is congruent to (41) and to A, and both A and B are
congruent to (39).

If A is a skew matrix, the corresponding bilinear form x'Ay is a skew
bilinear form. Hence, two such forms are equivalent under cogredient
nonsingular linear mappings if and only if they have the same rank. Moreover,
if f is a skew bilinear form of rank 2t it is equivalent under cogredient
nonsingular linear mappings to
$(x_1y_2 - x_2y_1) + \cdots + (x_{2t-1}y_{2t} - x_{2t}y_{2t-1})$.



EXERCISES 

Use a method analogous to that of the exercises of Section 8 to find cogredient 
linear mappings carrying the following skew forms to forms of the type above. 
a) 
b) 
c) 

+ 80:22/4 - 



11. Symmetric matrices and quadratic forms. The theory of symmetric 
matrices is considerably more extensive and complicated than that of skew 
matrices, and we shall obtain only some of its most elementary results. Our 
principal conclusion may be stated as 

Theorem 11. Every symmetric matrix A is congruent to a diagonal matrix 
of the same rank as A. 

We may evidently assume that $A = A' \neq 0$, and we shall prove first
that A is congruent to a symmetric matrix $H = (h_{ij})$ with some diagonal
element $h_{ii} \neq 0$. This is true for A = H if some diagonal element of A is
not zero. Otherwise there is some $a_{ij} \neq 0$, $a_{ji} = a_{ij}$, and $a_{ii} = a_{jj} = 0$. We
then obtain H as the result of adding the jth row of A to its ith row and
the jth column of A to its ith column, $h_{ii} = a_{ij} + a_{ji} = 2a_{ij} \neq 0$ as desired.
We now permute the rows and corresponding columns of H to obtain
a congruent symmetric matrix C with $c_{11} = h_{ii} \neq 0$. Then we add
$-c_{11}^{-1}c_{k1}$ times the first row to its kth row, follow with the corresponding
column transformation, and obtain a symmetric matrix

(42) $\begin{pmatrix} c_{11} & 0 \\ 0 & A_1 \end{pmatrix}$

congruent to A. Clearly, $A_1$ is a symmetric matrix with n - 1 rows. As
in the proof of Theorem 10 we carry out a finite number of such steps, and
it is clear that we ultimately obtain a diagonal matrix. It is a matrix
equivalent to A and must have the same rank.
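
The reduction used in this proof can be sketched as a procedure. The code below is an illustration under stated assumptions, not the author's algorithm verbatim: it assumes a symmetric matrix with rational elements, uses exact fractions, and the names `congruence_diagonalize`, `add_multiple`, and `interchange` are invented for the sketch. It returns P together with the diagonal matrix D = PAP'.

```python
from fractions import Fraction


def matmul(x, y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(m):
    return [list(col) for col in zip(*m)]


def congruence_diagonalize(a):
    """Return (P, D) with D = P A P' diagonal, for a symmetric rational matrix A."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]  # working copy, ends as D
    p = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_multiple(target, source, c):
        # Type 2 step: add c times row `source` to row `target` (in m and p),
        # then perform the corresponding column transformation on m.
        m[target] = [x + c * y for x, y in zip(m[target], m[source])]
        p[target] = [x + c * y for x, y in zip(p[target], p[source])]
        for row in m:
            row[target] += c * row[source]

    def interchange(i, j):
        # Type 1 step: interchange rows i and j (in m and p), then columns i and j of m.
        m[i], m[j] = m[j], m[i]
        p[i], p[j] = p[j], p[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if m[k][k] == 0:
            d = next((i for i in range(k + 1, n) if m[i][i] != 0), None)
            if d is None:
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if m[i][j] != 0), None)
                if pair is None:
                    break  # the remaining block is zero, so m is already diagonal
                i, j = pair
                add_multiple(i, j, Fraction(1))  # new diagonal element 2 * a_ij
                d = i
            interchange(k, d)
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(i, k, -m[i][k] / m[k][k])
    return p, m


A = [[0, 1, 2], [1, 0, 3], [2, 3, 4]]
P, D = congruence_diagonalize(A)
print([D[i][i] for i in range(3)])
```

The matrix P records the row transformations only, as in the illustrative example of Section 8, so that PAP' may be recomputed from P as a check.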

The result above may be applied to obtain a corresponding result on
symmetric bilinear forms of rank r, that is, bilinear forms f = x'Ay defined
by a symmetric matrix A of rank r. Theorem 11 then states that f is
equivalent under cogredient transformations to a form

(43) $a_1x_1y_1 + \cdots + a_rx_ry_r$ .



The results of Theorem 11 may also be applied to quadratic forms
$f = f(x_1, \ldots, x_n)$. As we saw in Section 1.7, f is the one-rowed square matrix
product

(44) $f = x'Ax = \sum_{i,j=1}^{n} x_i a_{ij} x_j$

for a symmetric matrix A. We call the uniquely determined symmetric
matrix A the matrix of f and its rank the rank of f. Now in Section 1.9 we
defined the equivalence of any quadratic form f and any second quadratic
form g = y'By. We may then use the notations developed above and see
that f and g are equivalent if and only if A and B are congruent. For if
our nonsingular linear mapping is represented by the matrix equation
x = P'u, then x' = u'P is a consequence, and f = u'(PAP')u, so that f and g
are equivalent if and only if B = PAP'. Thus Theorem 11 states that every
quadratic form of rank r is equivalent to

(45) $a_1x_1^2 + \cdots + a_rx_r^2$ .

The form (45) is of course to be regarded as a form in $x_1, \ldots, x_n$ with
matrix $\operatorname{diag}\{a_1, \ldots, a_r, 0, \ldots, 0\}$. However, it may be regarded as a
form in $x_1, \ldots, x_r$ with nonsingular matrix $\operatorname{diag}\{a_1, \ldots, a_r\}$. We shall
call a quadratic form f = x'Ax in $x_1, \ldots, x_n$ with nonsingular matrix A
a nonsingular form, and we have shown that every quadratic form of rank r in
n variables may be written as a nonsingular form in r variables whose matrix
is a diagonal matrix.

EXERCISES 

1. What is the symmetric matrix A of the following quadratic forms? 

a) 3x1 + 0:1X2 + 2xiX 3 3x 2 x 4 + x\ 

6) 2xix 2 - 8x3X4 + x\ 

c) x\ - 3x^2 + 2xiX 4 - 3xJ 



2. Find a nonsingular matrix P with rational elements for each of the following 
matrices A such that PAP' is a diagonal matrix. Determine P by writing A = 
IAI' and by applying cogredient elementary transformations. 



2 3 
00-2-3 
2 -2 -1 
-3 -1 -3 





7 6 - 

, 6 15 1 

e ) I - - -"0023 

-l 1 3 

'-3-41 0\ / 3 2 1 - 

-4 -5 - 5\ ,J 2 21-1 

1 1 1 *> 1 1 1 
~5 1 14/ \-l -1 O/ 

3. Write the symmetric bilinear forms whose matrices are those of (a), (b), (c),
and (d) of Ex. 2 and use the cogredient linear mappings obtained from that exercise
to obtain equivalent forms with diagonal matrices.

4. Apply the process of Ex. 3 to (e), (f), (g), and (h) of Ex. 2 for quadratic forms.

5. Which of the matrices of Ex. 2 are congruent if we allow any complex
numbers as elements of P?

6. Show that the forms $f = x_1^2 + x_2^2$ and $g = x_1^2 - x_2^2$ are not equivalent under
linear mappings with real coefficients. Hint: Consider the possible signs of values
of f and of g.

12. Nonmodular fields. In our discussion of the congruence and the
equivalence of two matrices A and B the elements of the transformation
matrices P and Q have thus far always been rational functions, with rational
coefficients, of the elements of A and B. While we have mentioned this
fact before, it has not, until now, been necessary to emphasize it. But the
reader will observe that we have not, as yet, given conditions that two
symmetric matrices be congruent, and our reason is that it is not possible to do
so without some statement as to the nature of the quantities which we allow
as elements of the transformation matrices P. We shall thus introduce an
algebraic concept which is one of the most fundamental concepts of our
subject, the concept of a field.

A field of complex numbers is a set F of at least two distinct complex
numbers a, b, . . . , such that a + b, ab, a - b, and a/c are in F for every
a, b, $c \neq 0$ in F. Examples of such fields are, then, the set of all real
numbers, the set of all complex numbers, and the set of all rational functions
with rational coefficients of any fixed complex number c.

If F is any field of complex numbers, the set K = F(x) of all rational 
functions in x with coefficients in F is a mathematical system having prop- 
erties, with respect to rational operations, just like those of F. Now it is 
true that even if one were interested only in the study of matrices whose 
elements are ordinary complex numbers there would be a stage of this study 
where one would be forced to consider also matrices whose elements are 
rational functions of x. Thus we shall find it desirable to define the concept 
of a field in such a general way as to include systems like the field K defined 
above. We shall do so and shall assume henceforth that what we called con- 
stants in Chapter I and scalars thereafter are elements of a fixed field F. 

The fields we have already mentioned all contain the complex number 
unity and are closed with respect to rational operations. But it is clearly 
possible to obtain every rational number by the application to unity of a 
finite number of rational operations. Thus all our fields contain the field 
of all rational numbers and are what are called nonmodular fields. The fields 
called modular fields will be defined in Chapter VI. We now make the fol- 
lowing brief 

DEFINITION. A set of elements F is said to form a nonmodular field if F 
contains the set of all rational numbers and is closed with respect to rational 
operations such that the following properties hold: 

I. (a + b) + c = a + (b + c) , (ab)c = a(bc) ; 
II. a(b + c) = ab + ac ; 
III. a + b = b + a , ab = ba 
for every a, b, c of F.

The difference a - b is always defined in elementary algebra to be a
solution x in F of the equation

(46) x + b = a ,

and the quotient a/b to be a solution y in F of the equation*

(47) yb = a .

Thus our hypothesis that F is closed with respect to rational operations
should be interpreted to mean that any two elements a and b of F determine
a unique sum a + b and a unique product ab in F such that (46) has a
solution in F and (47) has a solution in F if $b \neq 0$. In the author's Modern
Higher Algebra it is shown that the solutions of (46) and (47) are unique.
In fact, it may be concluded that the rational numbers 0 and 1 have the
properties

(48) $a + 0 = a \cdot 1 = a$ , $a \cdot 0 = 0$ ,

for every a of F and that there exists a unique solution x = -a of x + a = 0
and a unique solution $y = b^{-1}$ of yb = 1 for $b \neq 0$. Then the solutions
of (46) and (47) are uniquely determined by x = a + (-b), $y = ab^{-1}$.

We also see that the rational number -1 is defined so that (-1) + 1 = 0,
and thus $(-1 + 1)a = 0 \cdot a = 0$, whereas
$(-1 + 1)a = -1 \cdot a + 1 \cdot a = -1 \cdot a + a$. Hence $-a = -1 \cdot a$. It is also
true that -(-a) = a, (-a)(-b) = ab for every a and b of a field F.

EXERCISES 

1. Let a, b, and c range over the set of all rational numbers and F consist of all
matrices of the following types. Prove that F is a field. (Use the definition of
addition of matrices in (52).)

N /a -2b\ ., '~ ' r "- x /a b 

a ^(b a) b 

2. Show that if a, b, c, and d range over all rational numbers and $i^2 = -1$, the
set of all matrices of the following kind form a quasi-field which is not a field.

/a + bi 3(c -) 
\c di a 

* Note that in view of III the property II implies that (b + c)a = ba + ca, and the
existence of a solution of (46) is equivalent to that of b + x = a, of (47) to that of
by = a. But there are mathematical systems called quasi-fields in which the law ab = ba
does not hold, and for these systems the properties just mentioned must be made as
additional assumptions.

13. Summary of results. The theory completed thus far on polynomials
with constant coefficients and matrices with scalar elements may now be
clarified by restating our principal results in terms of the concept of a field.
We observe first that if f(x) and g(x) are nonzero polynomials in x with 
coefficients in a field F then they have a greatest common divisor d(x) with 
coefficients in F, and d(x) = a(x)f(x) + b(x)g(x) for polynomials a(x), b(x) 
with coefficients in F. 

We next assume that A and B are two m by n matrices with elements in 
a field F and say that A and B are equivalent in F if there exist nonsingular 
matrices P and Q with elements in F such that PAQ = B. Then A and B 
are equivalent in F if and only if they have the same rank. Moreover, corre- 
spondingly, two bilinear forms with coefficients in F are equivalent in F if 
and only if they have the same rank. Since the rank of a matrix (and of a 
corresponding bilinear form) is defined without reference to the nature of 
the field F containing its elements, the particular F chosen to contain the 
elements of the matrices is relatively unimportant for the theory. 

If A and B are square matrices with elements in a field F, then we call A
and B congruent in F if there exists a nonsingular matrix P with elements
in F such that PAP' = B. Similarly, we say that the bilinear forms x'Ay
and x'By are equivalent in F under cogredient transformations* if A and B
are congruent in F. When A' = -A the matrix A is skew, and every
matrix B congruent in F to A is skew. Two skew matrices with elements in F
are congruent in F if and only if they have the same rank. Hence, the
precise nature of F is again unimportant.

Let A = A' be a symmetric matrix with elements in F so that any
matrix B congruent in F to A also is a symmetric matrix with elements in F.
Then two corresponding quadratic forms x'Ax and x'Bx are equivalent in
F if and only if A and B are congruent in F. Moreover, we have shown
that every symmetric matrix of rank r and elements in F is congruent in F
to a diagonal matrix $\operatorname{diag}\{a_1, \ldots, a_r, 0, \ldots, 0\}$ with $a_i \neq 0$ in F and
that correspondingly every quadratic form x'Ax is equivalent in F to

$a_1x_1^2 + \cdots + a_rx_r^2$ .

The problem of finding necessary and sufficient conditions for two quad- 
ratic forms with coefficients in a field F to be equivalent in F is one in- 
volving the nature of F in a fundamental way, and no simple solution of this 
problem exists for F an arbitrary field. In fact, we can obtain results only 
after rather complete specialization of F, and these results may be seen to 
vary as we change our assumptions on F. 

* We leave to the reader the explicit formulation of the definitions of equivalence in
F of two forms, of two bilinear forms, and of two bilinear forms under cogredient linear
mappings, where all the forms considered have elements in F.

The simplest conditions are those given in

Theorem 12. Let F be a field with the property that for every a of F there
exists a quantity b of F such that $b^2 = a$. Then two symmetric matrices with
elements in F are congruent in F if and only if they have the same rank.

For every A = A' of rank r is congruent to $A_0 = \operatorname{diag}\{a_1, \ldots, a_r, 0, \ldots, 0\}$
for $a_i \neq 0$ and $a_i = b_i^2$ for $b_i \neq 0$ in F. Then if
$P = \operatorname{diag}\{b_1^{-1}, \ldots, b_r^{-1}, 1, \ldots, 1\}$, we have $PA_0P' = \operatorname{diag}\{I_r, 0\}$.
If also B = B' has rank r, then B is also congruent in F to $\operatorname{diag}\{I_r, 0\}$ and
hence to A. The converse follows from Theorem 2.2.

We then have the obvious consequences. 

COROLLARY I. Two symmetric matrices whose elements are complex num- 
bers are congruent in the field of all complex numbers if and only if they have 
the same rank. 

COROLLARY II. Let F be the field of either Theorem 12 or Corollary I.
Then two quadratic forms with coefficients in F are equivalent in F if and only
if they have the same rank. Hence every such form of rank r is equivalent
in F to

(49) $x_1^2 + \cdots + x_r^2$ .

14. Addition of matrices. There is one other result on symmetric matrices
over an arbitrary field which will be seen to have evident interest when we
state it. Its proof involves the computation of the product of two matrices

(50) $A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}$ , $B = \begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix}$

which have been partitioned into two-rowed square matrices whose elements
are rectangular matrices. If these matrices were one-rowed square
matrices, that is to say in F, we should have the formula

(51) $AB = \begin{pmatrix} A_1B_1 + A_2B_3 & A_1B_2 + A_2B_4 \\ A_3B_1 + A_4B_3 & A_3B_2 + A_4B_4 \end{pmatrix}$ .

But it is also true that if the partitioning of any A and B is carried out so
that the products in (51) have meaning and if we define the sum of two
matrices appropriately, then (51) will still hold. Thus (51) will have major
importance as a formula for representing matrix computations.

We now let $A = (a_{ij})$ and $B = (b_{ij})$, where i = 1, . . . , m and j = 1,
. . . , n, so that A and B are m by n matrices. Then we define

(52) $S = A + B = (s_{ij})$ , $s_{ij} = a_{ij} + b_{ij}$

(i = 1, . . . , m; j = 1, . . . , n) .

We have thus defined addition for any two matrices of the same shape such
that A + B is the matrix whose elements are the sums of correspondingly
placed elements in A and B.

The elements of our matrices are in a field F, and $a_{ij} + b_{ij} = b_{ij} + a_{ij}$.
If $C = (c_{ij})$ is also an m by n matrix, we have
$(a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij})$. Hence we have the properties

(53) A + B = B + A , (A + B) + C = A + (B + C) .

We observe also that if 0 is the m by n zero matrix, then A + 0 = A,
$A + (-1 \cdot A) = 0$.

Now let C = (cjk) be any n by q matrix. Then 

(54) (A + B)C = AC + BC . 

For this equation is clearly a consequence of the corresponding property 
of F, that is, of 

(55) $(a_{ij} + b_{ij})c_{jk} = a_{ij}c_{jk} + b_{ij}c_{jk}$ .

We have thus seen that addition of matrices has the properties (53)
always assumed for addition of the elements of our matrices and that the
law (54), which we call the distributive law for matrix addition and
multiplication, also holds. Clearly if D is a matrix such that DA is defined, then
similarly we have

(56) D(A + B) = DA + DB .
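
The laws (53), (54), and (56) can be checked on small numerical instances. The sketch below uses integer matrices chosen only for illustration; the helper names `add` and `matmul` are assumptions of the sketch.

```python
def add(a, b):
    """Sum of two matrices of the same shape, element by element as in (52)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matmul(x, y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
D = [[1, -1], [0, 2]]

commutative = add(A, B) == add(B, A)                                   # (53)
associative = add(add(A, B), C) == add(A, add(B, C))                   # (53)
right_distributive = matmul(add(A, B), C) == add(matmul(A, C), matmul(B, C))  # (54)
left_distributive = matmul(D, add(A, B)) == add(matmul(D, A), matmul(D, B))   # (56)
print(commutative, associative, right_distributive, left_distributive)
```

A check on particular matrices is of course not a proof; the proof is the element-by-element argument given by (55).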

Observe, however, that if n > 1 and A and B are n-rowed square matrices,
then |A + B| and |A| + |B| are usually not equal. For example,
if A and B are equal, then |A| + |B| = 2|A| while $|A + B| = |2A| = 2^n|A|$.

Let us now apply our definitions to derive (51). We let $A = (a_{ij})$ be an
m by n matrix, $B = (b_{jk})$ be an n by q matrix, and $A_1$ be an s by t matrix,
so that $A_1 = (a_{ij})$ but now with i = 1, . . . , s and j = 1, . . . , t. Then
(51) has meaning only if $B_1$ has t rows, and we thus assume that $B_1$ is a
matrix of t rows and g columns. Our partitioning is now completely
determined and necessarily $A_2$ has s rows and n - t columns, $A_3$ has m - s
rows and t columns, $A_4$ has m - s rows and n - t columns, $B_2$ has t rows
and q - g columns, $B_3$ has n - t rows and g columns, $B_4$ has n - t rows
and q - g columns. The element in the ith row and kth column of AB may
clearly be expressed as

$\sum_{j=1}^{n} a_{ij}b_{jk} = \sum_{j=1}^{t} a_{ij}b_{jk} + \sum_{j=t+1}^{n} a_{ij}b_{jk}$

(i = 1, . . . , m; k = 1, . . . , q) .

But this equation is equivalent to the matrix equation given by

(57) $A = (D_1 \;\; D_2)$ , $B = \begin{pmatrix} E_1 \\ E_2 \end{pmatrix}$ , $AB = D_1E_1 + D_2E_2$ ,

where we define $D_1$, $D_2$, $E_1$, $E_2$ by

(58) $D_1 = \begin{pmatrix} A_1 \\ A_3 \end{pmatrix}$ , $D_2 = \begin{pmatrix} A_2 \\ A_4 \end{pmatrix}$ , $E_1 = (B_1, B_2)$ , $E_2 = (B_3, B_4)$ .

Moreover, we may obtain (51) from (57) by simply using the ranges 1, 2,
. . . , s and s + 1, . . . , m for i separately, as well as 1, 2, . . . , g and
g + 1, . . . , q for k separately. In matrix language we have used (58) and
computed

(59) $D_1E_1 = \begin{pmatrix} A_1B_1 & A_1B_2 \\ A_3B_1 & A_3B_2 \end{pmatrix}$ , $D_2E_2 = \begin{pmatrix} A_2B_3 & A_2B_4 \\ A_4B_3 & A_4B_4 \end{pmatrix}$

as the result of partition of matrices and then have used (57) and addition
of matrices in (59) to give (51).
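
The block formula may be verified numerically. The sketch below partitions a 3 by 4 matrix A and a 4 by 3 matrix B with s = 1, t = 2, g = 2 in the notation above, forms the four blocks of the product from the blocks of the factors, and compares the result with the ordinary product; the matrices and the helper names `split` and `join` are chosen only for illustration.

```python
def add(a, b):
    """Sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matmul(x, y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def split(m, r, c):
    """Partition m into four blocks; the first block has r rows and c columns."""
    return ([row[:c] for row in m[:r]], [row[c:] for row in m[:r]],
            [row[:c] for row in m[r:]], [row[c:] for row in m[r:]])


def join(b1, b2, b3, b4):
    """Reassemble a matrix from its four blocks."""
    return [x + y for x, y in zip(b1, b2)] + [x + y for x, y in zip(b3, b4)]


A = [[1, 2, 3, 4], [0, 1, 0, 2], [5, 0, 1, 1]]   # m = 3, n = 4
B = [[1, 0, 2], [3, 1, 0], [0, 2, 1], [1, 1, 1]]  # n = 4, q = 3
s, t, g = 1, 2, 2

A1, A2, A3, A4 = split(A, s, t)
B1, B2, B3, B4 = split(B, t, g)
blockwise = join(add(matmul(A1, B1), matmul(A2, B3)),
                 add(matmul(A1, B2), matmul(A2, B4)),
                 add(matmul(A3, B1), matmul(A4, B3)),
                 add(matmul(A3, B2), matmul(A4, B4)))
print(blockwise == matmul(A, B))
```

The only requirement on the partitioning is the one stated in the text: the column partition of A must agree with the row partition of B.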

We shall now apply the process above to prove the following theorem on 
symmetric matrices mentioned above. 

Theorem 13. Let $A_1$ and $B_1$ be r-rowed nonsingular symmetric matrices
with elements in F, and A and B be the corresponding n-rowed symmetric
matrices

(60) $A = \begin{pmatrix} A_1 & 0 \\ 0 & 0 \end{pmatrix}$ , $B = \begin{pmatrix} B_1 & 0 \\ 0 & 0 \end{pmatrix}$

of rank r. Then A and B are congruent in F if and only if $A_1$ and $B_1$ are
congruent in F.

For if $A_1$ and $B_1$ are congruent in F there exists a nonsingular matrix $P_1$
such that $P_1A_1P_1' = B_1$. Then $P = \operatorname{diag}\{P_1, I_{n-r}\}$ is nonsingular, and
computation by the use of (51) gives PAP' = B. Conversely, if PAP' = B for
a nonsingular matrix P we may write

$P = \begin{pmatrix} P_1 & P_2 \\ P_3 & P_4 \end{pmatrix}$ ,

where $P_1$ is an r-rowed square matrix, and we shall have

$PAP' = \begin{pmatrix} P_1 & P_2 \\ P_3 & P_4 \end{pmatrix} \begin{pmatrix} A_1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} P_1' & P_3' \\ P_2' & P_4' \end{pmatrix} = \begin{pmatrix} P_1A_1 & 0 \\ P_3A_1 & 0 \end{pmatrix} \begin{pmatrix} P_1' & P_3' \\ P_2' & P_4' \end{pmatrix} = \begin{pmatrix} P_1A_1P_1' & P_1A_1P_3' \\ P_3A_1P_1' & P_3A_1P_3' \end{pmatrix}$ .

But then $B_1 = P_1A_1P_1'$, and $P_1$ must be nonsingular since
$|B_1| = |P_1| \, |A_1| \, |P_1'| \neq 0$. Hence $B_1$ and $A_1$ are congruent in F.

The result above thus states that, if f and g are quadratic forms with
coefficients in F so that f and g are equivalent only if they have the same
rank r, then f may be written as a nonsingular form $f_0$ in r variables, g may be
written as a nonsingular form $g_0$ in r variables, and finally the original forms
f and g are equivalent in F if and only if $f_0$ and $g_0$ are equivalent in F.

ORAL EXERCISES 
1. Compute A + B if



a) A 





B= 

2. Verify that (A + B)' = A' + B' for any m by n matrices A and B. 

3. Show that every n-rowed square matrix A is expressible uniquely as the sum
of a symmetric matrix B and a skew matrix C. Hint: Put A = B + C with B = B',
C = -C', compute A', and solve.

15. Real quadratic forms. We shall close our study of symmetric matrices
and hence of quadratic forms with coefficients in F as well by considering
the case where F is the field of all real numbers. Let then f = x'Ax
have rank r so that we may take $f = a_1x_1^2 + \cdots + a_rx_r^2$ for real $a_i \neq 0$.
We now call the number of positive $a_i$ the index t of f and prove

Theorem 14. Two quadratic forms with real coefficients are equivalent in
the field of all real numbers if and only if they have both the same rank and the
same index.

For proof we observe, first, that by Section 14 there is no loss of generality
if we assume that f = x'Ax where A is a nonsingular diagonal matrix, that
is, r = n. Moreover, there is clearly no loss of generality if we take
$A = \operatorname{diag}\{d_1, \ldots, d_t, -d_{t+1}, \ldots, -d_r\}$ for positive $d_i$. Then there exist real
numbers $p_i \neq 0$ such that $p_i^2 = d_i$, and if P is the nonsingular matrix
$\operatorname{diag}\{p_1^{-1}, \ldots, p_r^{-1}\}$ then PAP' is the matrix diag {1, . . . , 1, -1, . . . , -1}
with t elements 1 and r - t elements -1. Thus, f is equivalent in F to

(61) $(x_1^2 + \cdots + x_t^2) - (x_{t+1}^2 + \cdots + x_r^2)$ .

Now, if g has the same rank and index as f, it is equivalent in F to (61) and
hence to f. Conversely, let g have rank r = n and index s so that g is
equivalent in F to

(62) $(x_1^2 + \cdots + x_s^2) - (x_{s+1}^2 + \cdots + x_r^2)$ .

We propose to show that s = t. There is clearly no loss of generality if we
assume that $s \geq t$ and show that if s > t we arrive at a contradiction.

Hence, let s > t. Our hypothesis that the form f defined by (61) and
the form g defined by (62) are equivalent in the real field implies that there
exist real numbers $d_{ij}$ such that if we substitute the linear forms

(63) $x_i = d_{i1}y_1 + \cdots + d_{in}y_n$ (i = 1, . . . , n) ,

in $f = f(x_1, \ldots, x_n)$, we obtain as a result
$(y_1^2 + \cdots + y_s^2) - (y_{s+1}^2 + \cdots + y_n^2)$. Put $x_1 = x_2 = \cdots = x_t = 0$ and
$y_{s+1} = \cdots = y_n = 0$ in (63) and consider the resulting t equations

(64) $d_{i1}y_1 + \cdots + d_{is}y_s = 0$ (i = 1, . . . , t) .

These are t linear homogeneous equations in s > t unknowns, and there
exist real numbers $v_1, \ldots, v_s$ not all zero and satisfying these equations.
The remaining n - t equations of (63) then determine the values of $x_j$ as
certain numbers $u_j$ for j = t + 1, . . . , n, and we have the result
$h = f(0, 0, \ldots, 0, u_{t+1}, \ldots, u_n) = v_1^2 + \cdots + v_s^2 > 0$. But clearly
$h = -(u_{t+1}^2 + \cdots + u_n^2) \leq 0$, a contradiction.
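
By Theorem 14 the rank and index together decide equivalence over the real field, and both may be read from any congruent diagonal matrix. The following is a minimal sketch, assuming a symmetric matrix with rational elements and exact fraction arithmetic; the function names `congruent_diagonal` and `rank_and_index` are illustrative and the reduction is the one described in the proof of Theorem 11.

```python
from fractions import Fraction


def congruent_diagonal(a):
    """Diagonal elements of a diagonal matrix congruent to the symmetric matrix a."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]

    def add_multiple(target, source, c):
        # Row transformation followed by the corresponding column transformation.
        m[target] = [x + c * y for x, y in zip(m[target], m[source])]
        for row in m:
            row[target] += c * row[source]

    def interchange(i, j):
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if m[k][k] == 0:
            d = next((i for i in range(k + 1, n) if m[i][i] != 0), None)
            if d is None:
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if m[i][j] != 0), None)
                if pair is None:
                    break  # nothing but zeros remains
                i, j = pair
                add_multiple(i, j, Fraction(1))  # creates the diagonal element 2 * a_ij
                d = i
            interchange(k, d)
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(i, k, -m[i][k] / m[k][k])
    return [m[i][i] for i in range(n)]


def rank_and_index(a):
    """Rank = number of nonzero diagonal elements, index = number of positive ones."""
    diagonal = congruent_diagonal(a)
    return (sum(1 for d in diagonal if d != 0), sum(1 for d in diagonal if d > 0))


f = [[1, 0], [0, 1]]    # x1^2 + x2^2
g = [[1, 0], [0, -1]]   # x1^2 - x2^2
h = [[0, 1], [1, 0]]    # 2 x1 x2
print(rank_and_index(f), rank_and_index(g), rank_and_index(h))
```

Two real quadratic forms are then equivalent over the real field exactly when `rank_and_index` returns the same pair for their matrices.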

We have now shown that two quadratic forms with real coefficients and 
the same rank are equivalent in the field of all complex numbers but that, 
if their indices are distinct, they are inequivalent in the field of all real 
numbers. We shall next study in some detail the important special case 
t = r of our discussion. 

A symmetric matrix A and the corresponding quadratic form f = x'Ax
are called semidefinite of rank r if A is congruent in F to a matrix

$\begin{pmatrix} aI_r & 0 \\ 0 & 0 \end{pmatrix}$

for $a \ne 0$ in F. Thus f is semidefinite if it is equivalent to a form
$a(x_1^2 + \dots + x_r^2)$. We call A and f definite if r = n, that is, A is both
semidefinite and nonsingular.

If F is the field of all real numbers, we may take a = 1 and call A and f
positive or take a = -1 and call A and f negative. Then A and f are negative
if and only if -A and -f are positive. Thus we may and shall restrict
our attention to positive symmetric matrices and positive quadratic
forms without loss of generality.

If $f(x_1, \dots, x_n)$ is any real quadratic form, we have seen that there
exists a nonsingular transformation (63) with real $d_{ij}$ such that
$f = y_1^2 + \dots + y_t^2 - (y_{t+1}^2 + \dots + y_r^2)$. If $c_1, \dots, c_n$ are any
real numbers and if we put $x_i = c_i$ in (63), there exist unique solutions
$y_j = d_j$ of the resulting system of linear equations, and the $d_j$ may readily
be seen to be all zero if and only if the $c_i$ are all zero. Now if t < r, we
have f < 0 for $y_1 = \dots = y_t = 0$, $y_{t+1} = 1$, and all other $y_j = 0$, so that
$f(c_1, \dots, c_n) < 0$ for the corresponding $c_i$. Conversely, if
$f(c_1, \dots, c_n) < 0$, then t < r. For otherwise $f = y_1^2 + \dots + y_r^2$,
$f(c_1, \dots, c_n) = d_1^2 + \dots + d_r^2 \ge 0$. If t = r < n, then we put
$y_{r+1} = 1$ and all other $y_j = 0$ and have $c_1, \dots, c_n$ not all zero such
that $f(c_1, \dots, c_n) = 0$. Hence, if $f(c_1, \dots, c_n) > 0$ for all real $c_i$
not all zero, the form f is positive definite. Conversely, if f is positive
definite, we have $f = y_1^2 + \dots + y_n^2$,
$f(c_1, \dots, c_n) = d_1^2 + \dots + d_n^2 > 0$ for all $d_j$ not all zero and hence
for all $c_i$ not all zero. We have proved

Theorem 15. A real quadratic form $f(x_1, \dots, x_n)$ is positive semidefinite
if and only if $f(c_1, \dots, c_n) \ge 0$ for all real $c_i$, is positive definite if
and only if $f(c_1, \dots, c_n) > 0$ for all real $c_i$ not all zero.
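
As a small worked instance of this criterion (the particular form below is chosen here only for illustration and is not taken from the text), completing the square shows how an index t < r forces negative values:

```latex
\begin{aligned}
f &= x_1^2 + 4x_1x_2 + x_2^2 = (x_1 + 2x_2)^2 - 3x_2^2 = y_1^2 - y_2^2,
  \qquad y_1 = x_1 + 2x_2,\quad y_2 = \sqrt{3}\,x_2, \\
&\text{so that } r = 2,\ t = 1 < r, \text{ and } y_1 = 0,\ y_2 = 1 \text{ gives }
  x_1 = -\tfrac{2}{\sqrt{3}},\ x_2 = \tfrac{1}{\sqrt{3}},\quad
  f = \tfrac{4}{3} - \tfrac{8}{3} + \tfrac{1}{3} = -1 < 0 .
\end{aligned}
```

This form is therefore neither positive semidefinite nor positive definite.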

As a consequence of this result we shall prove 

Theorem 16. Every principal submatrix of a positive semidefinite matrix 
is positive semidefinite, every principal submatrix of a positive definite matrix 
is positive definite. 

For a principal submatrix B of a symmetric matrix A is defined as any
m-rowed symmetric submatrix whose rows are in the $i_1$th, ..., $i_m$th rows
of A and whose corresponding columns are in the corresponding columns
of A. Put $x_0 = (x_{i_1}, \dots, x_{i_m})$ so that $g = x_0'Bx_0$ is the quadratic
form with B as matrix, and we obtain g from f by putting $x_j = 0$ in f for
$j \ne i_k$. Clearly, if $f \ge 0$ for all $x_i = c_i$, then $g \ge 0$ for all values of
the $x_{i_k}$, and hence B is positive semidefinite by Theorem 15. If A is
positive definite and B is singular, then g = 0 for $x_{i_k}$ not all zero, and
hence f = 0 for the $x_j$ above all zero and for the $x_{i_k}$ not all zero, a
contradiction.

The converse of Theorem 16 is also true, and we refer the reader to the 
author's Modern Higher Algebra for its proof. We shall use the result just 
obtained to prove 

Theorem 17. Let A be an m by n matrix of rank r and with real elements. 
Then AA' is a positive semidefinite real symmetric matrix of rank r. 

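For we may write

$A = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q$ ,
$AA' = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} QQ' \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} P'$ ,

where P and Q are nonsingular matrices of m and n rows, respectively.
Then QQ' is a positive definite symmetric matrix, and we partition QQ' so
that $Q_1$ is an r-rowed principal submatrix of QQ'. By Theorem 16 the
matrix $Q_1$ is positive definite,

$AA' = PCP'$ , $C = \begin{pmatrix} Q_1 & 0 \\ 0 & 0 \end{pmatrix}$ ,

is congruent to the positive semidefinite matrix C of rank r and hence has
the property of our theorem.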
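
A numerical check of Theorem 17 can be sketched as follows. The matrix A, and the helper names `rank`, `times_transpose`, and `form_value`, are assumptions introduced here for illustration only; the rank is computed by exact elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def times_transpose(a):
    """Return AA' for a matrix A given as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in a] for r1 in a]

def form_value(s, c):
    """Value of the quadratic form with symmetric matrix S at the vector c."""
    n = len(c)
    return sum(c[i] * s[i][j] * c[j] for i in range(n) for j in range(n))

# Illustrative 3 by 4 matrix of rank 2: the third row is the sum of the first two.
A = [[1, 0, 2, -1],
     [0, 1, 1, 3],
     [1, 1, 3, 2]]
S = times_transpose(A)

print(rank(A), rank(S))            # the two ranks agree
print(form_value(S, [1, 1, -1]))   # the form vanishes on this nonzero vector
```

Since the form vanishes at a nonzero vector while never taking a negative value, this AA' is semidefinite but not definite, as expected when r < m.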

EXERCISE 

What are the ranks and indices of the real symmetric matrices of Ex. 2, Sec- 
tion 11? 



CHAPTER IV 
LINEAR SPACES 

1. Linear spaces over a field. The set $V_n$ of all sequences

(1) $u = (c_1, \dots, c_n)$

may be thought of as a geometric n-dimensional space. We assume the
laws of combination of such sequences of Section 1.8 and call u a point, or
vector, of $V_n$. We suppose also that the quantities $c_i$ are in a fixed field F
and call $c_i$ the ith coordinate of u, the quantities $c_1, \dots, c_n$ the coordinates
of u. The entire set $V_n$ will then be called the n-dimensional linear space
over F.

The properties of a field F may be easily seen to imply that

(2) $(u + v) + w = u + (v + w)$ , $u + v = v + u$ ,

(3) $a(bu) = (ab)u$ , $(a + b)u = au + bu$ , $a(u + v) = au + av$

for all a, b in F and all vectors u, v, w of $V_n$. The vector which we designate
by 0 is that vector all of whose coordinates are zero, and we have

(4) $u + 0 = u$ , $u + (-u) = 0$ ,

where $-u = (-c_1, \dots, -c_n)$. Then

(5) $0u = 0$ , $1u = u$ , $-u = (-1)u$ .

Note that the first zero of (5) is the zero quantity of F, and the second zero
is the zero vector. We shall use the properties just noted somewhat later in an
abstract definition of the mathematical concept of linear space, and we leave
the verification of the properties (2), (3), (4), and (5) of $V_n$ to the reader.

2. Linear subspaces. A subset L of $V_n$ is called a linear subspace of $V_n$ if
au + bv is in L for every a and b of F, every u and v of L. Then it is clear
that L contains all linear combinations

(6) $u = a_1u_1 + \dots + a_mu_m$ ,

for $a_i$ in F, and $u_i$ in L.

We observe that the set of all linear combinations (6) of any finite number
m of given vectors $u_1, \dots, u_m$ is a linear subspace L of $V_n$ according
to this definition. If now L is so defined, we shall say that $u_1, \dots, u_m$
span the space L and shall write

(7) $L = \{u_1, \dots, u_m\}$ .

It is clear that, if $e_j$ is the vector whose jth coordinate is unity and whose
other coordinates are zero, then $e_1, \dots, e_n$ span $V_n$.

The space spanned by the zero vector consists of the zero vector alone
and may be called the zero space and designated by L = {0}. In what follows
we shall restrict our attention to the nonzero subspaces L of $V_n$, and,
for the time being, we shall indicate, when we call L a linear space over F,
that L is a linear subspace over F of some $V_n$ over F.

3. Linear independence. Our definition of a linear space L which is a
subspace of $V_n$ implies that every subspace L contains the zero vector. If
$L = \{u_1, \dots, u_m\}$, then $0 = 0u_1 + \dots + 0u_m$. Hence the zero vector of
L is expressible in the form (6) in at least this one way. We shall say that
$u_1, \dots, u_m$ are linearly independent in F or that $u_1, \dots, u_m$ are a set of
linearly independent vectors (of $V_n$ over F), if there is no other such
expression of 0 in the form (6). Thus, $u_1, \dots, u_m$ are linearly independent if it
is true that a linear combination $a_1u_1 + \dots + a_mu_m = 0$ if and only if
$a_1 = a_2 = \dots = a_m = 0$. If $u_1, \dots, u_m$ are not linearly independent in F,
we shall say that they are linearly dependent in F.

A set of vectors $u_1, \dots, u_m$ are now seen to be linearly independent in F
if and only if the expression of every u of $L = \{u_1, \dots, u_m\}$ in the form (6)
is unique. For this property clearly implies linear independence as the special
case u = 0. Conversely, if $u_1, \dots, u_m$ are linearly independent and
$u = a_1u_1 + \dots + a_mu_m = b_1u_1 + \dots + b_mu_m$, then
$0 = (a_1 - b_1)u_1 + \dots + (a_m - b_m)u_m$, $a_i - b_i = 0$, $a_i = b_i$ as desired.
We now make the

DEFINITION. Let $L = \{u_1, \dots, u_m\}$ over F and $u_1, \dots, u_m$ be linearly
independent in F. Then we shall call $u_1, \dots, u_m$ a basis over F of L and
indicate this by writing

(8) $L = u_1F + \dots + u_mF$ .

It is evident that $V_n = e_1F + \dots + e_nF$. But we may actually show
that every subspace L spanned by a finite number of vectors of $V_n$ has a
basis in the above sense. Observe, first, that the definition of linear
independence in case m = 1 is that au = 0 only if a = 0 and thus that $u \ne 0$.
Then let $u_1, \dots, u_r$ be linearly independent vectors of $V_n$ and $u \ne 0$ be
another vector. Then either $u_1, \dots, u_r, u$ are linearly independent in F or
$a_1u_1 + \dots + a_ru_r + a_0u = 0$ for $a_i$ not all zero. If $a_0 = 0$, then
$a_1u_1 + \dots + a_ru_r = 0$, from which $a_1 = \dots = a_r = 0$, a contradiction.
But then $u = (-a_0^{-1}a_1)u_1 + \dots + (-a_0^{-1}a_r)u_r$ is in $\{u_1, \dots, u_r\}$.
It follows that, if $u_1, \dots, u_m$ are any m distinct nonzero vectors, we may
choose some largest number r of vectors in this set which are linearly
independent in F and we will then have the property that all remaining vectors
in the set are linear combinations with coefficients in F of these r. We state
this result as

Theorem 1. Let L be spanned by m distinct nonzero vectors $u_1, \dots, u_m$.
Then L has a basis consisting of certain r of these vectors, $1 \le r \le m$.
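
The selection described in this argument can be sketched in code: run through the given vectors and keep each one that is not a linear combination of those already kept. The test for dependence used below (comparing ranks by exact elimination) is a convenience assumed here rather than the argument of the text, and the vectors and the names `rank` and `basis_from` are illustrative only.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def basis_from(vectors):
    """Keep each vector that is not a linear combination of those already kept."""
    chosen = []
    for u in vectors:
        if rank(chosen + [u]) > len(chosen):
            chosen.append(u)
    return chosen

# Hypothetical spanning set: the second vector is twice the first,
# and the fourth is the sum of the first and third.
vectors = [(1, 2, 0), (2, 4, 0), (0, 1, 1), (1, 3, 1)]
basis = basis_from(vectors)
print(basis)
```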

EXERCISES 

1. Determine which of the following sets of three vectors form a basis of the
subspace they span. Hint: It is easy to see whether some two of the vectors, say
$u_1$ and $u_2$, are linearly independent. To see if $u_1, u_2, u_3$ are linearly
dependent we write $u_3 = xu_1 + yu_2$ and solve for x and y.

a) $u_1 = (1, 3, -3)$ , $u_2 = (2, 5, -2)$ , $u_3 = (1, 1, 6)$

b) $u_1 = (3, 1, 2)$ , $u_2 = (4, 1, 3)$ , $u_3 = (-1, -1, 0)$

c) $u_1 = (-1, -1, 1)$ , $u_2 = (3, 2, 1)$ , $u_3 = (7, 3, 9)$

d) $u_1 = (1, -2, -1, 3)$ , $u_2 = (2, -1, 1, 6)$ , $u_3 = (0, -1, -3, 1)$

e) $u_1 = (1, 1, 1, -1)$ , $u_2 = (2, 2, 2, -2)$ , $u_3 = (1, 2, 3, 4)$

f) $u_1 = (1, -1, 1, -1)$ , $u_2 = (1, 1, 1, 0)$ , $u_3 = (5, -1, 5, -3)$

2. Show that the space $L = \{u_1, u_2, u_3\}$ spanned by the following sets of
vectors is $V_3$. Hint: In every case one of the vectors, say $u_1$, has first
coordinate not zero, and L contains $u_2 - x_2u_1 = (0, b_2, c_2)$,
$u_3 - x_3u_1 = (0, b_3, c_3)$. Some linear combination of these two quantities has
the form (0, 0, d) and L contains $e_3 = (0, 0, 1)$. It is easy to show then that
L contains $e_2 = (0, 1, 0)$ and $e_1 = (1, 0, 0)$, $L = V_3$.

a) $u_1 = (1, -2, 3)$ , $u_2 = (2, 3, 1)$ , $u_3 = (-1, 3, 2)$

b) $u_1 = (0, 1, 8)$ , $u_2 = (1, -3, 6)$ , $u_3 = (1, -1, 23)$

c) $u_1 = (0, 3, 2)$ , $u_2 = (0, 2, 1)$ , $u_3 = (1, 5, 4)$

3. Determine whether or not the spaces spanned by the following sets of vectors
$u_1, u_2$ coincide with those spanned by the corresponding $v_1, v_2$. Hint: If
$L_1 = \{u_1, u_2\} = L_2 = \{v_1, v_2\}$, there must exist solutions
$x_1, x_2, x_3, x_4$ of the equations $v_1 = x_1u_1 + x_2u_2$, $v_2 = x_3u_1 + x_4u_2$
such that the determinant $x_1x_4 - x_2x_3 \ne 0$.

a) $u_1 = (3, -1, 2, 1)$ , $u_2 = (1, 4, 6, 1)$ ,
   $v_1 = (7, -11, -6, 1)$ , $v_2 = (7, 2, 10, 3)$

b) $u_1 = (1, 2, -1, 2)$ , $u_2 = (1, 2, 3, 4)$ ,
   $v_1 = (1, 2, -13, -4)$ , $v_2 = (6, 0, 4, 1)$

c) $u_1 = (1, -1, 2, -3)$ , $u_2 = (2, -2, 4, -6)$ ,
   $v_1 = (3, -3, 6, -9)$ , $v_2 = (4, -4, 8, -9)$

d) $u_1 = (1, -1, 0, 3)$ , $u_2 = (2, 1, 0, 1)$ ,
   $v_1 = (-1, -2, 1, 2)$ , $v_2 = (2, -2, 0, 6)$

e) $u_1 = (1, -2, 0, 1)$ , $u_2 = (2, -1, 1, 0)$ ,
   $v_1 = (3, -3, 1, 1)$ , $v_2 = (-1, -1, -1, 1)$

f) $u_1 = (1, 0, 1, -2)$ , $u_2 = (0, 1, -1, 2)$ ,
   $v_1 = (2, -2, 4, -8)$ , $v_2 = (1, 1, 0, 0)$

4. The row and column spaces of a matrix. We shall obtain the principal
theorems on the elementary properties of linear spaces by connecting this
theory with certain properties of matrices which we have already derived.
Let us consider a set of m vectors,

(9) $u_i = (a_{i1}, \dots, a_{in}) \quad (i = 1, \dots, m)$ ,

of $V_n$ over F. Then we may regard $u_i$ as being the ith row of the
corresponding m by n matrix $A = (a_{ij})$ and the space $L = \{u_1, \dots, u_m\}$ as
being what we shall call the row space of A. Thus every m by n matrix defines a
linear subspace of $V_n$, every subspace of $V_n$ spanned by m vectors defines
a corresponding m by n matrix.

If $P = (p_{ki})$ is any q by m matrix, the product PA is a q by n matrix.
The jth coordinate of the vector

(10) $w_k = p_{k1}u_1 + \dots + p_{km}u_m$

is $p_{k1}a_{1j} + \dots + p_{km}a_{mj}$, that is, the element in the kth row and jth
column of PA. Hence the kth row of PA is that linear combination of the rows
of A whose coefficients form the kth row of P.

We have now shown that every row of PA is in the row space of A. It
follows that the row space of PA is contained in that of A. If P is nonsingular,
then the result just derived implies that the row space of $A = P^{-1}(PA)$
is contained in the row space of PA and therefore that these two linear
spaces are the same. Thus we have

LEMMA 1. Let P be an m-rowed nonsingular matrix. Then the row spaces
of PA and A coincide.
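
Lemma 1 can be checked on a small case. In the sketch below, A and P are hypothetical matrices chosen for illustration, and the coincidence of the two row spaces is tested, as a convenience assumed here, by noting that stacking the rows of PA under those of A does not raise the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(p, a):
    """Product PA; its kth row is the combination of rows of A with coefficients p[k]."""
    return [[sum(p[k][i] * a[i][j] for i in range(len(a)))
             for j in range(len(a[0]))] for k in range(len(p))]

A = [[1, 2, 0],
     [0, 1, 1]]
P = [[1, 1],
     [2, 3]]          # determinant 1, so P is nonsingular
PA = matmul(P, A)

# Equal ranks for A, PA, and the stacked matrix mean neither row space
# contains a vector outside the other.
same_row_space = rank(A) == rank(PA) == rank(A + PA)
print(PA, same_row_space)
```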

In the proof of Theorem 3.6 we used the matrix product equivalent of
the elementary transformation Theorem 2.4, and we shall now state this
theorem as the useful

LEMMA 2. Let A be an m by n matrix of rank r. Then there exist nonsingular
matrices P and Q of m and n rows, respectively, such that

(11) $PA = \begin{pmatrix} G \\ 0 \end{pmatrix}$ , $AQ = (H \ \ 0)$ ,

where G and H have rank r, G is an r by n matrix, H is an m by r matrix.

We use (11) and note that the rows of G differ from those of PA only in 
zero rows. Then the rows of G span the row space of PA. By Lemma 1 
we have 

LEMMA 3. The row spaces of G and A coincide. 

We shall use this result in the proof of 

Theorem 2. The r rows of G form a basis of the row space of A. Any 
r + 1 vectors of the row space of A are linearly dependent in F. 

For we may designate the rows of G by $v_1, \dots, v_r$. Our definition of G
implies that there is no loss of generality if we permute its rows in any
desired fashion. Thus, if $b_1v_1 + \dots + b_rv_r = 0$ for $b_i$ in F not all zero, we
may assume for convenience that $b_1 \ne 0$. The determinant of the r-rowed
square matrix

(12) $P = \begin{pmatrix} P_1 \\ P_2 \end{pmatrix}$ , $P_1 = (b_1, \dots, b_r)$ , $P_2 = (0, \ I_{r-1})$

is clearly $b_1$, and hence P is nonsingular. Then PG is r-rowed and of rank r.
But this is impossible since $b_1v_1 + \dots + b_rv_r = 0$ is the first row of PG.
Hence the rows of G are linearly independent in F, they span the row space
of G and of A, they are a basis of the row space of A.

Assume, now, that $w_k = b_{k1}v_1 + \dots + b_{kr}v_r$ for $k = 1, \dots, r + 1$, so
that the $w_k$ are any r + 1 vectors of the row space $L = v_1F + \dots + v_rF$ of A.
Define $B = (b_{ki})$ and obtain $w_k$ as the kth row of the r + 1 by n matrix
BG. By Theorem 3.6 the rank of BG is at most r, and by Lemma 2 there
exists a nonsingular (r + 1)-rowed matrix $D = (d_{gk})$ for $g, k = 1, \dots, r + 1$,
such that the (r + 1)st row of D(BG) is the zero vector. But then
$d_{r+1,1}w_1 + \dots + d_{r+1,r+1}w_{r+1} = 0$, the $d_{r+1,k}$ form a row of the
nonsingular matrix D and cannot all be zero, the vectors $w_1, \dots, w_{r+1}$ are
linearly dependent in F. This proves Theorem 2.

We now apply Theorem 1 to obtain 

Theorem 3. The matrix G of Lemma 2 may be taken to be a submatrix of A. 
The integer r of Theorem 1 is in fact the rank of the m by n matrix whose ith 
row is Ui. 

For let A be an m by n matrix whose ith row is $u_i$ and let r be the integer
of Theorem 1, the rank of A be s. After a permutation of the rows of A, if
necessary, we may assume that $u_1, \dots, u_r$ are a basis of the row space of A.
Then $u_k = b_{k1}u_1 + \dots + b_{kr}u_r$ for $b_{ki}$ in F, and $k = r + 1, \dots, m$. The
matrix P given by

(13) $P = \begin{pmatrix} I_r & 0 \\ -B & I_{m-r} \end{pmatrix}$ , $B = (b_{ki})$ ,

is nonsingular, and it is clear that if

(14) $G = \begin{pmatrix} u_1 \\ \vdots \\ u_r \end{pmatrix}$ ,

then PA is given by (11). It follows that s is the rank of G, $s \le r$. If s < r
we apply Lemma 2 to obtain a nonsingular r-rowed matrix $D = (d_{gk})$ such
that the rth row of DG = 0, the rth row of D is not zero,
$d_{r1}u_1 + \dots + d_{rr}u_r = 0$ contrary to our hypothesis that $u_1, \dots, u_r$ are
linearly independent. This completes our proof.

We may now obtain the principal result on linear subspaces of $V_n$.

Theorem 4. Every linear subspace L over F of $V_n$ over F has a basis. Any
two bases of L have the same number $r \le n$ of vectors, and we shall call r the
order of L over F.

For $V_n = e_1F + \dots + e_nF$, and by Theorem 2 any n + 1 vectors of
$V_n$ are linearly dependent in F. Thus any linear subspace L over F of $V_n$
contains at most n linearly independent vectors. It follows that there
exists a maximum number $r \le n$ of linearly independent vectors $u_1, \dots, u_r$
in L, and that $u, u_1, \dots, u_r$ are linearly dependent in F for every u of L.
By the proof of Theorem 1 the vector u is in $\{u_1, \dots, u_r\}$,
$L = u_1F + \dots + u_rF$. But if also $L = v_1F + \dots + v_sF$, then Theorem 2
implies that $s \le r$ and similarly that $r \le s$, r = s is unique.

In closing this section we note that the rows of the transpose A' of A
are uniquely determined by the columns of A and are, indeed, their
transposes. Thus we shall call the row space of A' the column space of A. It is
a linear subspace of $V_m$. We call the order of the row and column spaces
of A, respectively, the row and column ranks of A. By Theorem 3 the rank
of A is its row rank. Also A and A' have the same rank and we have proved

Theorem 5. The row and column ranks of a matrix are equal to its rank.
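
A minimal numerical illustration of Theorem 5, under the assumption that rank is computed by exact row elimination: the same routine applied to a matrix and to its transpose returns the same number. The matrix below is hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],      # twice the first row
     [0, 1, 1, 1]]
A_transpose = [list(column) for column in zip(*A)]

row_rank = rank(A)               # order of the row space of A
column_rank = rank(A_transpose)  # order of the row space of A'
print(row_rank, column_rank)
```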

EXERCISES 

1. Solve Ex. 1 and 2 of Section 3 by the use of elementary transformations to 
compute the rank of the matrices whose rows are the given vectors. 

2. Form the four-rowed matrices whose rows are the vectors $u_1, u_2, v_1, v_2$ of
Ex. 3, Section 3. Show thus that $L_1 = \{u_1, u_2\} = L_2 = \{v_1, v_2\}$ if and only
if the ranks of the corresponding matrices A are equal to the order of the
subspace $L_1$, that is, the rank of the matrices formed from the first two rows
of each A.

3. Find a basis of the row space of each of the following matrices, the basis to 
consist actually of rows of the corresponding matrix. 



c) 



1-2 10 
2-4 20 
2-3-1 1 
4-7 11 
2-3-1 1 

-1001 
0-11 1 
230-1 
5100 
1234 



2301 

-2 -1 -1 1 

-2-1 3 

04-12 

212-2 

12-1 3 4 
35102 
11 3-6-6 
47036 
1 7 -15 -16 



4. Let A be a rectangular matrix of the form

$A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}$ ,

where $A_1$ is nonsingular. Show that then there exists a nonsingular matrix P
such that

$PA = \begin{pmatrix} A_1 & A_2 \\ 0 & A_5 \end{pmatrix}$ .

Give also a simple form for P. Hint: Show that the rows of $A_3$ are in the row
space of $A_1$.

5. Let the matrix of Ex. 4 be either symmetric or skew. Show that the choice
of P then implies that

$PAP' = \begin{pmatrix} A_1 & 0 \\ 0 & A_5 \end{pmatrix}$ .

Show also that, if the order of $A_1$ is the rank of A, the matrix $A_5 = 0$.

5. The concept of equivalence. In discussing the properties of mathematical
systems such as fields and linear spaces over F it becomes desirable
quite frequently to identify in some fashion those systems behaving exactly
the same with respect to the given set of definitive properties under
consideration. We shall call such systems equivalent and shall now proceed to
define this concept in terms of that of function.

Let G and G' be two systems and define a single-valued function f on G
to G'. In elementary mathematics it is customary to call G the range of
the independent variable and G' the range of the dependent variable. However,
it is more convenient in general situations to say that f is a function
on G to G' or that f is a correspondence on G to G'. Then f is given by

(15) $g \to g' = f(g)$

(read g corresponds to g') such that every element g of G determines a unique
corresponding element g' of G'. In elementary algebra and analysis the systems
G and G' are usually taken to be the field of all real or all complex
numbers and (15) is then given by a formula y = f(x). But the basic idea
there is that given above of a correspondence (15) on G to G'. This concept
may be seen to be sufficiently general as to permit its extension in many
directions.

Suppose now that (15) is a correspondence such that every element g' of
G' is the corresponding element f(g) of one and only one g of G. Then we
call (15) a one-to-one correspondence on G to G'. It is clear that (15) then
defines a second one-to-one correspondence

(16) $g' \to g$ ,

which is now on G' to G, and we may thus call (15) a one-to-one correspondence
between G and G' and indicate this by writing

(17) $g \leftrightarrow g'$ .

Note, however, that, if G and G' are the same system, the functions (15)
and (16) are, in general, distinct. Thus we may let G be the field of all real
numbers and (15) be the function $x \to 2x$, so that (16) is the function
$x \to \tfrac{1}{2}x$. Of course, if G and G' are distinct it is not particularly
important whether we use (15) or (16) to define our correspondence.

We proceed to use the concept just given in constructing the fundamental
definition of this section, that of equivalence. Let us consider two
mathematical systems G, G' of the same kind such as two fields or two linear
spaces over a fixed field F. These systems consist of sets of elements g, h,
... closed with respect to certain operations. (Thus, for example, we might
have g + h in G for every g and h in G and also g' + h' in G' for every g'
and h' in G'.) We then call G and G' equivalent if there exists a one-to-one
correspondence between them which is preserved under the operations of
their definition. We now see that we have defined two fields F and F' to
be equivalent if there exists a one-to-one correspondence (15) between them
such that (g + h)' = g' + h', (gh)' = g'h' for every g and h of F. Let us
then pass to the second case which we require for our further discussion of
linear spaces.
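
The two conditions (g + h)' = g' + h' and (gh)' = g'h' can be spot-checked in code for a particular correspondence. The sketch below uses the correspondence between a complex number a + bi and the two-rowed real matrix with rows (a, -b) and (b, a); the function names and the sample elements are assumptions made here for illustration, and a check on sample elements is of course not a proof.

```python
def to_matrix(z):
    """Matrix with rows (a, -b), (b, a) corresponding to the complex number a + bi."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))

def mat_add(x, y):
    """Sum of two two-rowed square matrices."""
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(x, y))

def mat_mul(x, y):
    """Product of two two-rowed square matrices."""
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

g, h = complex(2, 3), complex(-1, 4)   # sample elements, chosen arbitrarily

sum_preserved = to_matrix(g + h) == mat_add(to_matrix(g), to_matrix(h))
product_preserved = to_matrix(g * h) == mat_mul(to_matrix(g), to_matrix(h))
print(sum_preserved, product_preserved)
```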

Let F be a fixed field and V consist of a set of elements such that u + v
and au are unique elements of V for every u and v of V and a of F. Then
we shall call V a general linear space over F. If $V_0$ is a second* such space
and there is a one-to-one correspondence $u \leftrightarrow u_0$ between V and $V_0$
such that

$(u + v)_0 = u_0 + v_0$ , $(au)_0 = au_0$

for every u and v of V and a of F, then we shall say that V and $V_0$ are
equivalent over F. We have thus introduced two instances of what is a very
important concept in all algebra.

The reader should observe that under our definition every mathematical
system G is equivalent to itself and that if G is equivalent to a system G',
then G' is equivalent to G. Finally, if G is equivalent to G' and G' is
equivalent to G", then G is equivalent to G".

EXERCISES 

1. Verify the statement that the field of all rational functions with rational coeffi- 
cients of the complex number V2 is equivalent to the field of all matrices 

'-CD 

for rational a and 6 under the correspondence A < > a + 

* We use the notation $V_0$ instead of V' to avoid confusion in the consequent
usage of u' for the arbitrary vector of V' as well as for the transpose of u.

2. Verify the statement that the field of complex numbers is equivalent to the
field of matrices

$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$

for real a and b.

3. Verify the statement that the set of all scalar matrices with elements in a field 
F forms a field equivalent to F. 

4. Verify the statement that the mathematical system consisting of all two-rowed 
square matrices with elements in F and defined with respect to addition and multi- 
plication is equivalent to the set of all matrices 

fa n ai 2 

an a 

a* a, 

021 O22 

under the correspondence indicated by the notation. 

6. Linear spaces of finite order. We shall restrict all further study of
linear spaces to linear spaces of order n over F. Then we define V to be a
linear space of order n over a given field F if V is equivalent over F to $V_n$
over F. Clearly, our definition implies that every two linear spaces of the
same order n over F are equivalent over F.

Our definition also implies that, if V is a linear space of order n over F,
then the properties (2), (3), (4), and (5) hold for every u, v, w of V and
a, b in F. Moreover, every quantity of V is uniquely expressible in the form

(18) $c_1v_1 + \dots + c_nv_n$ ,

where the equivalence between V and $V_n$ is given by

(19) $c_1v_1 + \dots + c_nv_n \leftrightarrow (c_1, \dots, c_n)$ .

Conversely, define au = ua and u + v to be in V for every u, v of V and
a of F. Then it may be shown very easily that, if (2), (3), (4), and (5) hold
and every u of V is uniquely expressible in the form (18) for $c_i$ in F, then V
is equivalent over F to $V_n$. However, we prefer instead to define V by its
equivalence to $V_n$. This preference then requires the (somewhat trivial)
proof of

Theorem 6. Every linear subspace L of order r over F of $V_n$ is equivalent
over F to $V_r$.

Thus we justify the use of the term linear subspace L of order r over F by
proving that L is indeed what we have called a linear space of order r over F
contained in the space $V_n$. For proof we merely observe that every vector
of $L = u_1F + \dots + u_rF$ is uniquely expressible in the form

(20) $u = c_1u_1 + \dots + c_ru_r$

for $c_i$ in F. Thus u in L uniquely determines the $c_i$ in F and conversely.
It follows that

(21) $u \to (c_1, \dots, c_r)$

is a one-to-one correspondence between L and $V_r$ and it is trivial to verify
that it defines an equivalence of L and $V_r$.

We have now seen that every linear space L of order n over F may be
regarded as a linear subspace of a space M of order m over F for any
$m \ge n$. Moreover, L = M if and only if m = n.

Theorem 3 should now be interpreted for arbitrary linear spaces of order 
n, and we have a result which we state as 

Theorem 7. Let $L = u_1F + \dots + u_nF$ and $v_1, \dots, v_m$ be in L so that
there exist quantities $a_{ij}$ in F for which

$v_i = a_{i1}u_1 + \dots + a_{in}u_n$ ,

and the coefficient matrix $A = (a_{ij})$ is defined. Then the number of the $v_k$
which are linearly independent in F is the rank of the matrix A. Moreover,
$v_1, \dots, v_m$ form a basis of L over F if and only if m = n and A is nonsingular.

EXERCISES 

1. Verify the statement that the following sets of matrices are linear spaces of 
finite order over F and find a basis for each. 

a) The set of all m by n matrices with elements in F. 

b) The set of all m by n matrices whose elements not in the first row are zero.

c) The set of all n-rowed scalar matrices. 

d) The set of all n-rowed diagonal matrices. 

2. Find bases for the following linear spaces of polynomials with coefficients in F. 
a) All polynomials in x of degree at most three. 

b) All polynomials in independent variables x and y and degree at most two
(in x and y together).

c) All polynomials in $x = t^2 + t$ and $y = t^3 + 2$ and degree at most two in
x and y.

d) All polynomials in $\sqrt{2}$, with F the field of all rational numbers.

e) All polynomials in a primitive cube root of unity with F the field of all 
rational numbers. 

f) All polynomials in w = u(i + 1) with F the field of all rational numbers
and $i^2 = -1$, $u^2 = 2$. Hint: Prove that 1, u, i, ui are a basis.

g) The polynomials of (f) but with F the field of all real numbers.

3. Let A = diag {1, 1, 2}. Show that 7, A, A* form a basis of the set of all 
three-rowed diagonal matrices. 

4. Show that 7, A, B, AB form a basis of the set of all two-rowed square mat- 
rices if 




5. Show that, if 

/O 0\ 

A = (0 -1 01, 

\0 I/ 

then 7, A, A 2 , 5, AB, A 2 , B 2 , AB 2 , A 2 J5 2 form a basis of the set of all three-rowed 
square matrices. 

7. Addition of linear subspaces. If $L_1 = \{v_1, \dots, v_m\}$ and
$L_2 = \{w_1, \dots, w_q\}$ are linear subspaces over F of a space L of order n
over F, the subspace $L_0 = \{v_1, \dots, v_m, w_1, \dots, w_q\}$ of L will be called
the sum of $L_1$ and $L_2$ and will be designated generally by

(22) $L_0 = \{L_1, L_2\}$ .

If the only vector which is in both $L_1$ and $L_2$ is the zero vector, we shall say
that $L_1$ and $L_2$ are complementary subspaces of their sum and write

(23) $L_0 = L_1 + L_2$ .

In this case the order of $L_0$ is the sum of the orders of $L_1$ and $L_2$ and in fact
we shall show that, if $L_1 = v_1F + \dots + v_mF$, $L_2 = w_1F + \dots + w_qF$,
then $L_0 = v_1F + \dots + v_mF + w_1F + \dots + w_qF$.

For it is clear that $a_1v_1 + \dots + a_mv_m + b_1w_1 + \dots + b_qw_q = 0$ if and
only if $v = a_1v_1 + \dots + a_mv_m = (-b_1)w_1 + \dots + (-b_q)w_q$ is in both $L_1$
and $L_2$. Thus $v \ne 0$ implies that the $a_i$ and $b_j$ are not all zero and therefore
that the vectors $v_i$ and $w_j$ spanning $L_0$ are linearly dependent in F and do
not form a basis of $L_0$. Conversely, if necessarily v = 0, then the $v_i$ and $w_j$
do form a basis of $L_0$, and $L_0$ has order m + q.
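
The argument just given suggests a simple computational test, sketched here under the assumption that orders are computed as ranks by exact elimination: two subspaces are complementary in their sum exactly when the order of the sum equals the sum of the orders. The function names and the sample spanning sets are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def complementary(l1, l2):
    """True when the spaces spanned by l1 and l2 share only the zero vector,
    tested by comparing the order of the sum with the sum of the orders."""
    return rank(l1 + l2) == rank(l1) + rank(l2)

L1 = [[1, 0, 0, 0], [0, 1, 0, 0]]
L2 = [[0, 0, 1, 0], [0, 0, 0, 1]]
L3 = [[0, 0, 1, 0], [1, 1, 1, 0]]   # contains (1, 1, 0, 0), which lies in L1

print(complementary(L1, L2), complementary(L1, L3))
```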

If $L_1$ and $L_0$ are linear subspaces of L and if $L_0$ contains $L_1$, we may ask
whether a linear subspace $L_2$ of $L_0$ exists such that $L_0 = L_1 + L_2$. The
existence of such a space is clearly a corollary of

Theorem 8. Let $L_1$ of order m over F be a linear subspace of L of order n
over F. Then there exists a linear subspace $L_2$ of L such that $L_1$ and $L_2$ are
complementary subspaces of L.

The result above may be proved by the method we used to prove Theorem 1,
where we apply this method to the set $v_1, \dots, v_m, u_1, \dots, u_n$, in
$L = u_1F + \dots + u_nF$. However, let us give, instead, a proof using matrix
theory. We put $L = V_n$, let G be the m by n matrix whose rows are the basal
vectors $v_1, \dots, v_m$ of $L_1$. Then G has rank m, and there exists a nonsingular
matrix Q such that the columns of GQ are a permutation of those of G,

$GQ = (G_1 \ \ G_2)$ ,

where $G_1$ is nonsingular. Then the matrix

$A_0 = \begin{pmatrix} G_1 & G_2 \\ 0 & I_{n-m} \end{pmatrix}$

is nonsingular, and $A = A_0Q^{-1}$ is obtained by permuting the columns of $A_0$.
But then

$A = \begin{pmatrix} G \\ H \end{pmatrix}$

is nonsingular, the rows of A span $V_n$, and the rows of H span the space $L_2$
which we have been seeking. Moreover, it is clear that the rows of H are
certain of the vectors $e_j$ which we defined in Section 2.
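
The matrix proof is constructive, and a sketch of it follows. It assumes that the nonsingular m-rowed block is located by choosing pivot columns in an exact elimination, which is one convenient way (not necessarily the author's) of finding m columns of G forming a nonsingular matrix; the complement is then spanned by the unit vectors belonging to the remaining columns. The names `pivot_columns` and `complement`, and the matrix G, are hypothetical.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Indices of the pivot columns found by exact row elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return pivots

def complement(basis, n):
    """Unit vectors e_j for every column j that is not a pivot column of the basis."""
    used = set(pivot_columns(basis))
    return [[1 if i == j else 0 for i in range(n)] for j in range(n) if j not in used]

G = [[1, 2, 0, 1],
     [2, 4, 1, 0]]        # rows are a basis of a subspace of order 2
H = complement(G, 4)      # rows span a complementary subspace

# The stacked matrix has as many pivots as rows, so its rows form a basis of the whole space.
print(H, len(pivot_columns(G + H)))
```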

EXERCISES 

1. Let LI be the row space of each of the m by n matrices of Section 4, Ex. 3. Use 
the method above to find a basis of the corresponding V n consisting of a basis of 
Li and of a complementary space L 2 . 

2. Let the following vectors $u_i$ span $L_1$, $v_i$ span $L_2$. Find a complement
in $\{L_1, L_2\}$ to $L_1$ and to $L_2$.

a) $u_1 = (1, -1, 1, 1)$ , $u_2 = (2, -2, 1, 2)$ , $u_3 = (1, -1, 0, 1)$
   $v_1 = (1, 2, 1, 0)$ , $v_2 = (4, -1, 4, 0)$

b) $u_1 = (1, 2, 0, 0)$ , $u_2 = (1, -1, 1, 0)$ , $u_3 = (1, 0, 0, 1)$
   $v_1 = (4, 3, 1, 1)$ , $v_2 = (4, -3, 3, 2)$ , $v_3 = (1, 4, 2, -3)$

c) $u_1 = (1, 0, 2, -1)$ , $u_2 = (0, 1, 2, -1)$ , $u_3 = (2, 1, 6, -3)$
   $v_1 = (3, -2, 2, -1)$ , $v_2 = (-5, 3, -4, 2)$ , $v_3 = (-2, 1, -2, 1)$

8. Systems of linear equations. The set of all linear forms in $x_1, \dots, x_n$
with coefficients in F is a linear space

(24) $L = x_1F + \dots + x_nF$

of order n over F. The left members

(25) $f_i = f_i(x_1, \dots, x_n) = a_{i1}x_1 + \dots + a_{in}x_n$

of a system

(26) $a_{i1}x_1 + \dots + a_{in}x_n = c_i \quad (i = 1, \dots, m)$ ,

of m linear equations in n unknowns, are such forms and are in L. Then,
if r is the rank of the m by n matrix $A = (a_{ij})$ of coefficients of (26), we
see by Theorem 7 that certain r of the forms $f_i$ are linearly independent in
F and the remaining m - r forms are linear combinations of these r.

We may assume without loss of generality that the equations (26) have
been labeled so that $f_1, \dots, f_r$ are linearly independent in F, and

(27) $f_k = b_{k1}f_1 + \dots + b_{kr}f_r \quad (k = r + 1, \dots, m)$

for $b_{kj}$ in F. Then

(28) $A = \begin{pmatrix} u_1 \\ \vdots \\ u_m \end{pmatrix}$ , $u_i = (a_{i1}, \dots, a_{in}) \quad (i = 1, \dots, m)$ ,

where $u_1, \dots, u_r$ are linearly independent in F, and

(29) $u_k = b_{k1}u_1 + \dots + b_{kr}u_r \quad (k = r + 1, \dots, m)$ .

If the system (26) is consistent, there exist quantities $d_1, \dots, d_n$ in F such
that $f_i(d_1, \dots, d_n) = c_i$. Then (27) implies that

(30) $c_k = f_k(d_1, \dots, d_n) = b_{k1}c_1 + \dots + b_{kr}c_r \quad (k = r + 1, \dots, m)$ .

Define the augmented matrix A* of the system (26) to be the m by n + 1
matrix

(31) $A^* = \begin{pmatrix} u_1^* \\ \vdots \\ u_m^* \end{pmatrix}$ , $u_i^* = (a_{i1}, \dots, a_{in}, c_i) \quad (i = 1, \dots, m)$ ,

and see that (29) and (30) imply that

(32) $u_k^* = b_{k1}u_1^* + \dots + b_{kr}u_r^* \quad (k = r + 1, \dots, m)$ ,

so that the rank of A* is at most r. But A* has A as a submatrix; A* has
rank r.

Conversely, if A* has the same rank r as A and we choose $u_1, \dots, u_r$ to
be linearly independent, then $u_1^*, \dots, u_r^*$ are clearly linearly independent.
We then have (29) and may apply elementary row transformations of type
1 to A* which add $-(b_{k1}u_1^* + \dots + b_{kr}u_r^*)$ to $u_k^*$ for $k = r + 1, \dots, m$.
These replace the submatrix A of A* by

(33) $\begin{pmatrix} G \\ 0 \end{pmatrix}$ , $G = \begin{pmatrix} u_1 \\ \vdots \\ u_r \end{pmatrix}$ ,

and replace A* by

(34) $\begin{pmatrix} G & C_1 \\ 0 & C_2 \end{pmatrix}$ ,

where the (m - r)-rowed and one-columned matrix $C_2$ has
$c_{k0} = c_k - (b_{k1}c_1 + \dots + b_{kr}c_r)$ as its elements. But, clearly, if any
$c_{k0} \ne 0$, the matrix A* has a nonzero (r + 1)-rowed minor. This is impossible
if A* is of rank r.

We now see that the system (26) is consistent if and only if A* has the 
same rank r as A. Moreover, we have already shown that, if A* does have 
the same rank as A, then $m - r$ of the equations (26) may be regarded as 
linear combinations of the remaining r equations and are satisfied by the 
solutions of these equations. It thus remains to show that any system of r 
equations in $x_1, \dots, x_n$ with matrix of rank r has solutions. We may write 
$x = (x_1, \dots, x_n)$ and see that such a system may be regarded as a matrix 
equation 

(35)    $Gx' = v', \qquad v = (c_1, \dots, c_r)$.

Before solving (35) we prove 

LEMMA 4. Let G be an r by n matrix of rank r. Then 

(36)    $G = (I_r \;\; 0)\,Q$

for a nonsingular n-rowed matrix Q. 

For Lemma 2 states that $G = (H \;\; 0)Q_1$, where $Q_1$ is n-rowed and nonsingular 
and H is an r by r matrix of rank r. Then H is nonsingular, so is 
$Q_2 = \mathrm{diag}\{H, I_{n-r}\}$, and $(I_r \;\; 0)Q_2 = (H \;\; 0)$. Thus we have (36) for 
$Q = Q_2Q_1$. 

The system (35) may now be written as 

(37)    $(I_r \;\; 0)\,y' = v', \qquad y = xQ' = (y_1, \dots, y_n)$.

But then $y = xQ'$ is a nonsingular linear transformation and the $y_i$ are 
linearly independent linear forms in $x_1, \dots, x_n$. Evidently, (37) has the 
solution $y_i = c_i$ for $i = 1, \dots, r$; the solution of (35) is then given by 
$x = y(Q')^{-1}$ for $y_i = c_i$ $(i = 1, \dots, r)$ and $y_{r+1}, \dots, y_n$ arbitrary. 
Observe that, if we choose the notation of the $x_i$ so that $G = (G_1, G_2)$ with 
$G_1$ an r-rowed nonsingular matrix, then 

(38)    $Q = \begin{pmatrix} G_1 & G_2 \\ 0 & I_{n-r} \end{pmatrix}, \qquad y = (Y_1, Y_2), \qquad x = (X_1, X_2)$,

where $Y_1$ and $X_1$ have r columns. From this we obtain 

        $Q' = \begin{pmatrix} G_1' & 0 \\ G_2' & I_{n-r} \end{pmatrix}, \qquad xQ' = (X_1G_1' + X_2G_2', \; X_2)$,

so that $Y_1 = X_1G_1' + X_2G_2'$, $Y_2 = X_2$, and our solution of (35) is given by 

(39)    $X_1 = (Y_1 - X_2G_2')(G_1')^{-1}$.

But then we have solved for $x_1, \dots, x_r$ as linear functions of $x_{r+1}, \dots, x_n$. 
For exercises on linear equations we refer the reader to the First Course in 
the Theory of Equations. 
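
The consistency criterion of this section, that (26) has a solution if and only if A* has the same rank as A, can be tried numerically. The following is a minimal sketch over the rational numbers, not a method taken from the text: the helper names `rank` and `is_consistent` are hypothetical, and the rank is found by ordinary Gaussian elimination rather than by the theory of linear forms used above.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (a list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_consistent(A, c):
    """True when the system with coefficient matrix A and constants c has a solution."""
    augmented = [list(row) + [ci] for row, ci in zip(A, c)]  # the matrix A*
    return rank(A) == rank(augmented)


# Assumed sample data: the second equation is twice the first.
A = [[1, 2, 1], [2, 4, 2], [1, 0, 1]]
print(is_consistent(A, [3, 6, 1]))  # True: A and A* both have rank 2
print(is_consistent(A, [3, 7, 1]))  # False: A* has rank 3
```

Exact rational arithmetic is used so that the rank comparison is not disturbed by rounding.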

9. Linear mappings and linear transformations. The system of equations 
(3.1) of a linear mapping was expressed in (3.32) as a matrix equation 
x' = y'A'. Let us interchange the roles of m and n, A and A' in this equation. 
Then we see that a linear mapping may be expressed as a matrix 
equation 

(40)    $v = uA$,

where A is an m by n matrix and 

(41)    $u = (y_1, \dots, y_m), \qquad v = (x_1, \dots, x_n)$.

Clearly, u is a vector of $V_m$ over F, v is a vector of $V_n$ over F, and (40) may 
be regarded as a correspondence $u \to uA$ defined by A whereby every u 
of $V_m$ determines a unique vector uA of $V_n$. We now proceed to formulate 
this concept more abstractly. 

Let L and M be linear spaces of respective orders m and n over F and 
consider a correspondence on L to M. Designate the correspondence by the 
symbol S, so that S is the function 

        $S: u \to u^S$

(read u goes to u upper S) wherein every vector u of L determines a unique 
$u^S$ in M. Suppose also that 

(42)    $(au + bu_0)^S = au^S + bu_0^S$

for every a and b of F, u and $u_0$ of L. Then we shall call S a linear mapping 
of the space L on the space M and describe (42) as the property that S is 
linear. 

Suppose now that $L = u_1F + \dots + u_mF$ and $M = v_1F + \dots + v_nF$, 
so that we are given not only the spaces L and M but fixed bases as well. 
Then a linear mapping S uniquely determines $u_i^S$ in M, and hence 

(43)    $u_i^S = a_{i1}v_1 + \dots + a_{in}v_n \qquad (i = 1, \dots, m)$,

for $a_{ij}$ in F. Thus S determines also an m by n matrix $A = (a_{ij})$. But, 
conversely, if A is an m by n matrix and we define $u_i^S$ by (43) for the given 
elements $a_{ij}$ of A, then the property that S is linear uniquely determines the 
mapping S. This is true since every u of L is uniquely expressible in the 
form 

(44)    $u = y_1u_1 + \dots + y_mu_m$,

for $y_i$ in F and (42) implies that 

(45)    $u^S = y_1u_1^S + \dots + y_mu_m^S$.

It follows that to every linear mapping S of L on M and given bases of L 
and M, there corresponds a unique m by n matrix A and conversely. We 
shall call A the matrix of S with respect to the given bases of L and M. 
We now observe that 

        $u^S = x_1v_1 + \dots + x_nv_n$,

where 

        $x_j = \sum_{i=1}^{m} y_i a_{ij} \qquad (j = 1, \dots, n)$.

But then, if we assume temporarily that $L = V_m$ and $M = V_n$ and put 
$v = u^S$ in (40) and (41), we see that S is the linear mapping 

(46)    $u \to u^S = uA$

for the given matrix A. Thus every linear mapping which is a change of 
variable as in (3.1) may be regarded as a linear mapping of the space $V_m$ 
on the space $V_n$. 
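
The mapping (46) and the linearity property (42) can be illustrated with a small computation. The following is only a sketch: the function name `apply` and the sample matrix are assumptions made for the example. It shows that the rows of A are the images of the unit vectors, as (43) requires, and checks (42) for one choice of a, b, u, $u_0$.

```python
from fractions import Fraction


def apply(u, A):
    """Image uA of the row vector u (length m) under the m-by-n matrix A."""
    ncols = len(A[0])
    return [sum(Fraction(ui) * row[j] for ui, row in zip(u, A)) for j in range(ncols)]


# Assumed sample data: a mapping of V_2 on V_3.
A = [[1, 2, 0],
     [0, 1, 3]]

# The unit vectors are mapped to the rows of A.
assert apply([1, 0], A) == A[0]
assert apply([0, 1], A) == A[1]

# Linearity: (a*u + b*u0)A equals a*(uA) + b*(u0 A).
a, b = 3, -2
u, u0 = [1, 4], [2, -1]
left = apply([a * x + b * y for x, y in zip(u, u0)], A)
right = [a * x + b * y for x, y in zip(apply(u, A), apply(u0, A))]
print(left == right)  # True
```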

Let us next observe the effect on the matrix defined by a linear mapping 
of a change of bases of the linear spaces. Define new bases of L and M, 
respectively, by 

(47)    $u_k^{(0)} = \sum_{i=1}^{m} p_{ki}u_i, \qquad v_l^{(0)} = \sum_{j=1}^{n} q_{lj}v_j$

for $k = 1, \dots, m$ and $l = 1, \dots, n$. Then $P = (p_{ki})$ and $Q = (q_{lj})$ are 
nonsingular and, as we saw in (3.33), we may also write the second set of 
equations of (47) in the form 

(48)    $v_j = \sum_{l=1}^{n} r_{jl}v_l^{(0)}$,

where $R = (r_{jl}) = Q^{-1}$. We apply the linearity of S to (47) and obtain 
$(u_k^{(0)})^S = \sum_i p_{ki}u_i^S$. Substituting (43) and (48), we have 

(49)    $(u_k^{(0)})^S = \sum_{l=1}^{n} b_{kl}v_l^{(0)}$,

where $b_{kl} = \sum_{i,j} p_{ki}a_{ij}r_{jl}$. Hence the matrix $B = (b_{kl})$ of the linear mapping 
S with respect to our new bases is given by 

(50)    $B = PAQ^{-1}$.

Since P and $Q^{-1}$ are arbitrary nonsingular matrices of m and n rows, 
respectively, we see that changes of basis in L and M replace A by an 
equivalent matrix. Thus any two equivalent m by n matrices define the same 
mapping of L on M. 

If L and M are the same space we shall henceforth call a linear mapping 
S of L on L a linear transformation of L. Since we are now considering only 
a single space, the only possible meaning of the $u_i$ and $v_j$ in (43) can be that 
of a fixed basis $u_1, \dots, u_n$ of L of order n over F and of a second basis 
$v_1, \dots, v_n$ of L. Let us restrict our attention to the case where $v_i = u_i$. 
Then we define the matrix A of a linear transformation S on L with respect 
to a fixed basis $u_1, \dots, u_n$ of L to be the matrix A determined by (43) 
with $u_i = v_i$. We have defined thereby a one-to-one correspondence 
between the set of all n-rowed square matrices with elements in F and the set 
of all linear transformations on L of order n over F. If $L = V_n$, such a 
correspondence is given by 

        $u \to u^S = uA$.

Clearly, we should and do call S a nonsingular linear transformation if A is 
nonsingular. Since we may then solve for $u = u^SA^{-1}$, we see that a 
nonsingular linear transformation defines a one-to-one correspondence of L 
and itself. 
We now observe the effect on A of a change of basis of L. We use (47) 
but note that $u_i = v_i$ and $u_i^{(0)} = v_i^{(0)}$, so that we have P = Q. Hence a 
change of basis* of L with matrix P replaces the matrix A of a linear 
transformation by 

(51)    $B = PAP^{-1}$.

It is now clear that in order to study those properties of a linear 
transformation on L which do not depend on the basis of L over F we need only 
study those properties of square matrices A which are unchanged when we 
pass to $PAP^{-1}$. We shall call two matrices A and B similar if (51) holds 
for a nonsingular matrix P and shall obtain necessary and sufficient 
conditions that A and B be similar in our next chapter. 
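
A small check of (51) may help fix the conventions. If z is the coordinate vector of u with respect to the new basis, then y = zP is its coordinate vector with respect to the old basis, and the image of u has new coordinates zB and old coordinates yA, so that (zB)P = (zP)A. The sketch below verifies this for two-rowed matrices; the names `matmul` and `inverse_2x2` and the sample matrices are assumptions made for the illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse_2x2(P):
    """Inverse of a nonsingular two-rowed matrix by the adjoint formula."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("P is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[2, 1], [0, 3]]   # matrix of S with respect to the old basis
P = [[1, 1], [1, 2]]   # rows express the new basis in terms of the old one
B = matmul(matmul(P, A), inverse_2x2(P))   # B = P A P^{-1}
print(B == [[0, 2], [-3, 5]])  # True

z = [[5, -7]]          # coordinates of some u in the new basis
y = matmul(z, P)       # coordinates of the same u in the old basis
print(matmul(matmul(z, B), P) == matmul(y, A))  # True
```

Here A and B have the same trace 5 and the same determinant 6, two of the properties that are unchanged by the passage to $PAP^{-1}$.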

EXERCISES 

1. Let S be the linear mapping (46) of $V_3$ on $V_4$ defined for the following 
matrices A. Find the vectors of $V_4$ into which (-2, 3, 4), (1, 0, 0), (0, 1, 0) of $V_3$ are 
mapped by S. 

I 1 -3 1\ /-2 3 4 0\ 

a) A = I 2-6-1 21 b) A = I -1 2 4) 

\-l 3 1 -I/ \ 2 -3/ 

/2 -1 2 -5\ /-2 0\ 

c) A = -2 3-2 d) A = I 0-10 01 

\1 1 -1 -I/ \-l 1 1 I/ 

2. Show that the linear transformations (46) of $V_3$ defined for the following 
matrices A are nonsingular and find their inverse transformations. Apply both S and 
$S^{-1}$ to the vectors of Ex. 1. 

/-4 3 5\ 

a) A = 3 20 6) A = 
V 5 -2 117 

3. Define S for the matrices of Ex. 2 and let C be one or the other of the curves 
of all vectors (points) $u = (x_1, x_2, x_3)$ whose coordinates satisfy the following 
equations. Find the equation of the curves $C^S$ into which each S carries each C. 




d) 3x1 - 2x1 + 2zi + 40:1X2 - 

6) -4x? + Hxl = -6x^2 - 10xix 8 - 18x2X3 + 1 

* Observe that a change of bases (47) of L defines a linear transformation of L when 
we put $u_i^S = u_i^{(0)}$. Thus we may regard a change of basis as being induced by a 
nonsingular linear transformation. 




10. Orthogonal linear transformations. The final topic of our study of 
linear spaces will be a brief introduction to those linear transformations 
permitted in what is called Euclidean geometry. Let then $V_n$ be the 
n-dimensional linear space of vectors $u = (c_1, \dots, c_n)$ over a field F. We define 
the norm of u (square of the length of u) to be the value $f(u) = f(c_1, \dots, c_n)$ 
of the quadratic form 

(52)    $f(x_1, \dots, x_n) = x_1^2 + \dots + x_n^2$.

We propose to study those linear transformations S on $V_n$ which are said 
to be length preserving and define this concept as the property that 
$f(u) = f(u^S)$ for every u of $V_n$. Such transformations S will be called orthogonal. 

We may define S by $u^S = uA$ for an n-rowed square matrix A. Then, 
clearly, $f(u) = uu'$, $f(u^S) = u^S(u^S)' = uAA'u'$. Write $B = AA' = (b_{ij})$ 
and see that $f(u^S) = \sum c_ib_{ij}c_j$, $f(u) = \sum c_i^2$. Put $c_p = 1$, $c_j = 0$ for $j \ne p$ and 
have $f(u) = f(u^S)$ only if $b_{pp} = 1$ for every p. Then take $c_p = c_q = 1$, all 
other $c_j = 0$, from which $f(u) = 2$, $f(u^S) = 2 + 2b_{pq} = f(u)$ only if 
$b_{pq} = 0$ for $p \ne q$, and B = I is the identity matrix. We call matrices A 
satisfying 

(53)    $AA' = I$

orthogonal matrices and have shown that S is orthogonal if and only if its 
matrix is an orthogonal matrix. 

We have seen that S determines A uniquely only in terms of a fixed 
basis of $V_n$. Now in Euclidean geometry the only changes of basis allowed 
are those obtained by an orthogonal linear transformation, that is, those 
for which the matrix P of (51) is orthogonal. But then $PP' = I$, $P' = P^{-1}$, 
$P'P = I$, so that if S is a linear transformation with orthogonal matrix A 
and we replace A by $PAP^{-1} = B$, then 

        $BB' = (PAP')(PA'P') = PAA'P' = I$,

and B is also orthogonal. 
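
The two facts just proved, that a matrix with AA' = I preserves the norm (52) and that $PAP^{-1}$ is again orthogonal when P is, can be checked on a rational example. This is a sketch under assumed sample data; the helper names `matmul`, `transpose` and `norm` are hypothetical.

```python
from fractions import Fraction as Fr


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def norm(u):
    """The norm of u in the sense of (52): the sum of the squares of its coordinates."""
    return sum(c * c for c in u)


I2 = [[1, 0], [0, 1]]
A = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]
print(matmul(A, transpose(A)) == I2)  # True: A is orthogonal

u = [2, -7]
uS = matmul([u], A)[0]                # the image uA
print(norm(u) == norm(uS))            # True: both norms equal 53

P = [[0, 1], [1, 0]]                  # an orthogonal change of basis, P' = P^{-1}
B = matmul(matmul(P, A), transpose(P))
print(matmul(B, transpose(B)) == I2)  # True: B = PAP' is orthogonal
```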

EXERCISES 

1. What are the possible values of the determinant of an orthogonal matrix? 

2. Let I be the identity matrix, A be a skew matrix such that I + A is 
nonsingular. Show that $(I + A)^{-1}(I - A)$ is orthogonal. 

3. Show that every orthogonal two-rowed matrix A has the form 

        $\begin{pmatrix} a & b \\ \mp b & \pm a \end{pmatrix}$,

where $a = s(s^2 + t^2)^{-1/2}$, $b = t(s^2 + t^2)^{-1/2}$, and s and t range over all quantities in F 
such that $s^2 + t^2 \ne 0$. Hint: The result is trivial if A is a diagonal matrix. Now 
$a_{11}^2 + a_{12}^2 = 1$ so that $s = a_{11}$, $t = a_{12}$ are of the form above, while, conversely, 
$a^2 + b^2 = (s^2 + t^2)(s^2 + t^2)^{-1} = 1$. The values $a_{21} = \mp b$, $a_{22} = \pm a$ are derived from 
$AA' = I$. 

4. The equations for rotation of axes in a real Euclidean plane through an angle 
h are $x = x_0 \cos h - y_0 \sin h$, $y = x_0 \sin h + y_0 \cos h$. Show that the corresponding 
matrix is orthogonal and that every real orthogonal two-rowed matrix is either the 
matrix of a rotation or its product by the matrix 

        $E = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

of a reflection $x = x_0$, $y = -y_0$. 

5. Find a real orthogonal matrix P for each of the following symmetric matrices 
A such that PAP' is a diagonal matrix. Hint: Compute $PAP' = D = (d_{ij})$ and 
put $d_{12} = 0$. 



\ 1N i M / 3 4> \ x A l \ j\ / 3 

a) OJ 6) (4 -3J C > (l l) ( 2 



11. Orthogonal spaces. We shall call two vectors u and v of $V_n$ orthogonal 
(that is, in a geometric sense, perpendicular) if $uv' = 0$. Then, if A is 
any orthogonal matrix and we define a linear transformation S of $V_n$ by 
$u^S = uA$, we have 

        $u^S = uA, \qquad v^S = vA, \qquad u^S(v^S)' = uAA'v' = uv' = 0$

if and only if $uv' = 0$. Thus orthogonal transformations on $V_n$ preserve 
orthogonality of vectors of $V_n$. 

If $L = v_1F + \dots + v_mF$ is a linear subspace of $V_n$, we define the set 
O(L) to be the set of all vectors w in $V_n$ such that $vw' = 0$ for every v of L. 
Then O(L) is a linear subspace of $V_n$ which we shall call the space orthogonal 
in $V_n$ to L. For, clearly, if $w_1$ and $w_2$ are in O(L) and a and b are in F, we 
have $v(aw_1 + bw_2)' = avw_1' + bvw_2' = 0$, so that $aw_1 + bw_2$ is in O(L). We now 
prove 

Theorem 9. Let L be a linear subspace of order m of $V_n$. Then the order 
of O(L) is $n - m$. 

In fact we shall prove 

Theorem 10. Let L be the row space of the m by n matrix G of rank m so 
that $G = (I_m \;\; 0)Q$ for a nonsingular n-rowed matrix Q. Then O(L) is the 
row space of $H = (0 \;\; I_{n-m})(Q')^{-1}$ of rank $n - m$. 

For proof we first note that the elements of GH' are the products $v_iw_j'$ 
of the rows $v_i$ of G by the transposes of the rows $w_j$ of H. But 
$GH' = (I_m \;\; 0)QQ^{-1}(0 \;\; I_{n-m})'$, and, clearly, then $GH' = 0$. Let then 
$O(L) = y_1F + \dots + y_qF$ of order q over F so that O(L) contains the row space of H. 
Evidently H has rank $n - m$, and we must have $q \ge n - m$. We let $H_0$ be 
the q by n matrix whose kth row is $y_k$, and we have $GH_0' = 0$, 
$0 = (I_m \;\; 0)QH_0'$. The rank of the n by q matrix 

        $D = QH_0' = \begin{pmatrix} D_1 \\ D_2 \end{pmatrix}$, where $D_1$ has m rows,

is q since Q is nonsingular. But 

        $(I_m \;\; 0)D = D_1$,

so that $D_1 = 0$, D has at most $n - m$ nonzero rows, and D must have rank 
$q \le n - m$. This proves that $q = n - m$, and the row space of H is O(L). 

Note that the row space of H is the set of all solutions $x = (x_1, \dots, x_n)$ 
of the homogeneous linear system $Gx' = 0$. 
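
Since O(L) is the set of all solutions of Gx' = 0, a basis of O(L) may be computed by solving this homogeneous system. The sketch below does so by reducing G to row echelon form over the rationals, which is an assumption of convenience and not the construction $H = (0 \;\; I_{n-m})(Q')^{-1}$ of Theorem 10; the function name `orthogonal_space_basis` and the sample G are hypothetical.

```python
from fractions import Fraction


def orthogonal_space_basis(G):
    """Basis of all x with Gx' = 0, found from the reduced row echelon form of G."""
    m = [[Fraction(x) for x in row] for row in G]
    ncols = len(m[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for col in range(ncols):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][col] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        w = [Fraction(0)] * ncols
        w[free] = Fraction(1)
        for row, pc in zip(m, pivots):
            w[pc] = -row[free]  # solve for the pivot unknowns
        basis.append(w)
    return basis


# Assumed sample data: L is the row space of a 2 by 4 matrix of rank 2.
G = [[1, 1, 0, 2],
     [0, 1, 1, 1]]
H = orthogonal_space_basis(G)
print(len(H))  # 2, which is n - m as Theorem 9 states
print(all(sum(g * w for g, w in zip(row, vec)) == 0 for row in G for vec in H))  # True
```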

The sum of the orders of L and O(L) is n, and it is natural to ask whether 
or not they are complementary in $V_n$, that is, whether $V_n = L + O(L)$. 
This is not true in general, since if F is the field of all complex numbers and 
$L = vF$, $i^2 = -1$, $v = (1, i)$, then $vv' = 1 + i^2 = 0$. Hence L is contained 
in O(L) and, since both have order 1, $O(L) = L$. However, we may prove 

Theorem 11. Let F be a field whose quantities are real numbers. Then 
$V_n = L + O(L)$. 

For by Theorem 9 it suffices to prove that the only vector w in both L and 
O(L) is the zero vector. Hence let w be in both L and O(L) so that $w = dG$, 
where $d = (d_1, \dots, d_m)$ has elements in F and G is an m by n matrix of 
rank m whose row space is L. Then $Gw' = 0$ while also $Gw' = GG'd'$. By 
Theorem 3.17 the matrix GG' has rank m and is nonsingular, $GG'd' = 0$ 
only if $d' = 0$, $d = 0$, $w = 0$ as desired. 

EXERCISE 

Let L be the space over the field of all real numbers spanned by the following 
vectors $u_i$. Find a basis of the space O(L) in the corresponding $V_n$. 

a) $u_1 = (1, 2, -1, 0)$, $u_2 = (0, 1, 2, 1)$ 

b) $u_1 = (1, 0, 1, 1)$, $u_2 = (0, 1, 0, 1)$, $u_3 = (-1, 2, 1, 0)$ 

c) $u_1 = (1, -1, 2, 1)$, $u_2 = (2, -1, 2, 3)$, $u_3 = (1, -2, 4, 0)$ 

d) $u_1 = (1, 2, -1)$, $u_2 = (-1, 1, 0)$, $u_3 = (3, 3, 3)$ 



CHAPTER V 
POLYNOMIALS WITH MATRIC COEFFICIENTS 

1. Matrices with polynomial elements. Let F be a field and designate by 

(1)    $F[x]$

(read: F bracket x) the set of all polynomials in x with coefficients in F. 
We shall consider m by n matrices with elements in F[x] and define 
elementary transformations of three types on such matrices as in Section 2.4. As 
we stated in that section, we assume that in the elementary transformations 
of type 2 the quantities c are permitted to be any quantities of F[x]. But 
those of type 3 are restricted so that the quantity a in F[x] shall have an 
inverse in F[x]. Then $a \ne 0$ must be a constant polynomial, that is, a may 
be any nonzero quantity of F. 

We now let A and B be m by n matrices with elements in F[x] and call 
A and B equivalent in F[x] if there exists a sequence of elementary 
transformations carrying A into B. The field F(x) of all rational functions of x with 
coefficients in F contains F[x], and it is thus clear that if A and B are 
equivalent in F[x] they are also equivalent in F(x). Hence we see that A and B 
are equivalent in F[x] only if they have the same rank. We may then 
prove 

LEMMA 1. Every nonzero matrix A of rank r with elements in F[x] is 
equivalent in F[x] to 

(2)    $\begin{pmatrix} G_1 & 0 \\ 0 & 0 \end{pmatrix}$,

where $G_1 = \mathrm{diag}\{f_1, \dots, f_r\}$ for monic polynomials $f_i = f_i(x)$ such that $f_i$ 
divides $f_{i+1}$. 

For the elements of all matrices equivalent in F[x] to A are polynomials 
in x, and in the set of all such polynomials there is a nonzero polynomial 
$f_1 = f_1(x)$ of lowest degree. Using elementary transformations of types 3 
and 1, we may assume that $f_1$ is monic and is the element in the first row 
and column of a matrix $C = (c_{ij})$ equivalent in F[x] to A. By the Division 
Algorithm for polynomials we may write $c_{i1} = q_if_1 + r_i$ for $q_i$ and $r_i$ in 
F[x] and $r_i$ of degree less than the degree of $f_1$. But if we add $-q_i$ times the 
first row of C to its ith row, we pass to a matrix equivalent in F[x] to A 
with $r_i$ as the element in its ith row and first column. Our definition of $f_1$ 
thus implies that $r_i$ is zero. Moreover, we have now shown A equivalent 
in F[x] to a matrix $D = (d_{ij})$ with $d_{i1} = 0$ for $i \ne 1$, $d_{11} = f_1$. Similarly, 
we see that every $d_{1j}$ is divisible by $f_1$ and hence that A is equivalent in 
F[x] to a matrix 

(3)    $\begin{pmatrix} f_1 & 0 \\ 0 & A_2 \end{pmatrix}$,

where $A_2$ has $m - 1$ rows and $n - 1$ columns. Then either $A_2 = 0$, or we 
may apply the same process to $A_2$. After a finite number of such steps we 
ultimately show that A is equivalent in F[x] to a matrix (2) of our lemma 
such that every $f_i = f_i(x)$ is monic and is a polynomial of least degree in 
the set of all elements of all matrices equivalent in F[x] to 

(4)    $A_i = \begin{pmatrix} G_i & 0 \\ 0 & 0 \end{pmatrix}, \qquad G_i = \mathrm{diag}\{f_i, \dots, f_r\}$.

Write $f_{i+1} = f_is_i + t_i$, where $s_i$ and $t_i$ are in F[x] and the degree of $t_i$ is less 
than the degree of $f_i$. Then we add the first row of $A_i$ to its second row so 
that the submatrix in the first two rows and columns of the result is 

(5)    $\begin{pmatrix} f_i & 0 \\ f_i & f_{i+1} \end{pmatrix}$.

We then add $-s_i$ times the first column to the second column to obtain 
a matrix equivalent in F[x] to $A_i$ and with corresponding submatrix 

(6)    $\begin{pmatrix} f_i & -s_if_i \\ f_i & t_i \end{pmatrix}$.

Our definition of $f_i$ thus implies that $t_i = 0$, and $f_i$ divides $f_{i+1}$ as described. 

We observe now that the elements of every t-rowed minor of a matrix A 
with elements in F[x] are polynomials in x, so that these minors are also 
polynomials in x. By Section 2.7 if B is obtained from A by an elementary 
transformation, then every t-rowed minor of B is either a t-rowed minor of 
A, the product of a t-rowed minor of A by a quantity a of F, or the sum 
$M_1 + fM_2$ where $M_1$ and $M_2$ are t-rowed minors of A and f is a polynomial 
of F[x]. If d in F[x] then divides every t-rowed minor of A, it also divides 
every t-rowed minor of all m by n matrices B equivalent in F[x] to A. But 
A is also equivalent in F[x] to B and therefore the t-rowed minors of 
equivalent matrices have the same common divisors. We may state this result as 

LEMMA 2. Let A be an m by n matrix with elements in F[x] and $d_i$ be the 
greatest common divisor of all i-rowed minors of A. Then $d_i$ is also the greatest 
common divisor of the i-rowed minors of every matrix B equivalent in F[x] to A. 

Every $(k + 1)$-rowed minor $M_{k+1}$ of A may be expanded according to a 
row and is then a linear combination, with coefficients in F[x], of k-rowed 
minors of A. Hence $d_k$ divides every $M_{k+1}$ so that $d_k$ divides $d_{k+1}$. We 
observe also that in (2) the only nonzero k-rowed minors are the k-rowed 
minors $|\mathrm{diag}\{f_{i_1}, \dots, f_{i_k}\}|$ for $i_1 < i_2 < \dots < i_k$, and since clearly every $f_i$ 
divides $f_{i+1}$ we see that the g.c.d. of all k-rowed minors of (2) is $f_1 \cdots f_k$. 
We thus have $d_k = f_1 \cdots f_k$, whence 

(7)    $f_k = \dfrac{d_k}{d_{k-1}}$.

It is customary to relabel the polynomials $f_i$ and thus to write $f_r = g_1$, 
$f_{r-1} = g_2, \dots, f_1 = g_r$. We call $g_j$ the jth invariant factor of A and see that if we 
define $d_0 = 1$ we have the formula 

(8)    $g_j = \dfrac{d_{r-j+1}}{d_{r-j}} \qquad (j = 1, \dots, r)$.

Moreover, $g_j$ is now divisible by $g_{j+1}$ for $j = 1, \dots, r - 1$. We apply 
elementary transformations of type 1 to (2) and see that if A has invariant 
factors $g_1, \dots, g_r$, then A is equivalent in F[x] to 

(9)    $\begin{pmatrix} G & 0 \\ 0 & 0 \end{pmatrix}, \qquad G = \mathrm{diag}\{g_1, \dots, g_r\}$.

If, then, B is equivalent in F[x] to A, it has the same greatest common 
divisors $d_k$ and hence the same $g_j$ of (8), while, if the converse holds, B is 
equivalent in F[x] to (9) and to A. We have proved 
Theorem 1. Two m by n matrices with elements in F[x] are equivalent in 
F[x] if and only if they have the same invariant factors. Every m by n matrix 
of rank r with invariant factors $g_1, \dots, g_r$ is equivalent in F[x] to (9). 
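
Formula (7) gives a direct, if inefficient, way to compute the invariant factors of a small matrix: form the g.c.d. $d_k$ of all k-rowed minors and divide successive values. The following is a sketch under stated assumptions, not the reduction process of Lemma 1. Polynomials with rational coefficients are represented as coefficient lists with the constant term first, determinants are expanded along the first row, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def trim(p):
    """Drop zero leading coefficients; the zero polynomial is the empty list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])


def scale(p, c):
    return trim([c * a for a in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(p, q):
    """Quotient and remainder of p on division by the nonzero polynomial q."""
    p = trim(p)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        c, k = p[-1] / q[-1], len(p) - len(q)
        quo[k] = c
        p = add(p, scale([Fraction(0)] * k + list(q), -c))
    return trim(quo), p


def gcd(p, q):
    """Monic greatest common divisor by the Euclidean process."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return scale(p, 1 / p[-1]) if p else []


def det(M):
    """Determinant of a square matrix of polynomials, expanded along the first row."""
    if not M:
        return [Fraction(1)]
    total = []
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total = add(total, scale(mul(entry, det(minor)), (-1) ** j))
    return total


def minor_gcd(A, k):
    """d_k: the g.c.d. of all k-rowed minors of A (empty list if they all vanish)."""
    g = []
    for rows in combinations(range(len(A)), k):
        for cols in combinations(range(len(A[0])), k):
            g = gcd(g, det([[A[i][j] for j in cols] for i in rows]))
    return g


def invariant_factors(A):
    """f_1, ..., f_r of (7), each dividing the next; reverse the list for g_1, ..., g_r."""
    factors, prev = [], [Fraction(1)]
    for k in range(1, min(len(A), len(A[0])) + 1):
        d = minor_gcd(A, k)
        if not d:
            break
        factors.append(divmod_poly(d, prev)[0])
        prev = d
    return factors


x = [Fraction(0), Fraction(1)]
x2 = mul(x, x)
A = [[x2, x], [x, x2]]            # assumed sample matrix
print(invariant_factors(A) == [[0, 1], [0, -1, 0, 1]])  # True: f_1 = x, f_2 = x^3 - x
```

For this matrix $d_1 = x$ and $d_2 = x^4 - x^2$, so that $g_1 = x^3 - x$ and $g_2 = x$ in the notation of (8).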

As in our theory of the equivalence of matrices on a field we may obtain 
a theory of equivalence in F[x] using matrix products instead of elementary 
transformations. We thus define a matrix P to be an elementary matrix if 
both P and $P^{-1}$ have elements in F[x]. Then, if $a = |P|$ and $b = |P^{-1}|$, 
the quantities a and b are in F[x], $ab = |I| = 1$, so that a and b are nonzero 
quantities of F. But if P has elements in F[x] so does adj P. Hence so will 
$P^{-1} = |P|^{-1}\,\mathrm{adj}\,P$ if $|P|$ is a nonzero quantity of F. Thus we have proved 
that a square matrix P with polynomial elements is elementary if and only if its 
determinant is a constant (that is, in F) and not zero. 

We now observe that, in particular, the determinant of any elementary 
transformation matrix is in F. Hence, if A and B are square matrices with 
elements in F[x] and are equivalent in F[x], their determinants differ only 
by a factor in F. Moreover, if $|A|$ and $|B|$ are monic polynomials, then 
the equivalence of A and B in F[x] implies that $|A| = |B|$ and, in fact, 
that when A is a nonsingular matrix with $g_1, \dots, g_n$ as invariant factors its 
determinant is the product $g_1 \cdots g_n$. 

It is clear now that a square matrix P with elements in F[x] is elementary 
if and only if P is equivalent in F[x] to the identity matrix, and thus the 
invariant factors of P are all unity. We may now redefine equivalence. We 
call two m by n matrices A and B with elements in F[x] equivalent in F[x] if 
there exist elementary matrices P and Q such that PAQ = B. Then we again 
have the result that A and B are equivalent in F[x] if and only if they have 
the same invariant factors. For under our first definition P and Q are 
equivalent in F[x] to identity matrices and hence may be expressed as products of 
elementary transformation matrices. But, if $P_0$ and $Q_0$ are elementary 
transformation matrices, the products $P_0A$, $AQ_0$ are the matrices resulting from 
the application of the corresponding elementary transformations to A. 
Hence, PAQ must have the same invariant factors as A. The converse is 
proved similarly, and we have the result desired. 

In closing let us note a rather simple polynomial property of invariant 
factors. The invariant factors of a matrix A with elements in F[x] and rank 
r are certain monic polynomials $g_i(x)$ such that $g_{i+1}(x)$ divides $g_i(x)$ for 
$i = 1, \dots, r - 1$. If $g_k(x) = 1$, then $g_j(x) = 1$ for larger $j = k + 1, \dots, r$. 
Let us then call those $g_i(x) \ne 1$ the nontrivial invariant factors 
of A, the remaining $g_i(x) = 1$ the trivial invariant factors of A. Thus there 
exists an integer t such that $g_i(x)$ has positive degree for $i = 1, \dots, t$, and 
$g_{t+1}(x) = \dots = g_r(x) = 1$. 

EXERCISES 

1. Express the following matrices as polynomials in x whose coefficients are 
matrices with constant elements. 



a) I 

x 3 + 3x 2 - 3x - 1 x 3 - x 2 + 4x - 2 



/ X 2 X 


X 2 


- 2x 


\ 


b) 


* 


- 


1 




1 


+ 


X 




X 2 


- 2x-3 




V 


+ 


2x 2 


- 2 


x 2 


+ 


2x 


+ 2 


X 3 


- 4x - 6; 




/ 


+ 


x 2 - 


-2x 






x 2 


-x 3 




X 3 + 


x\ 


c) 






X 3 








X 


-x 3 




X 3 




\2x 3 + x* 


- 2x 


+ 1 




X 2 


x 


1 


x + 


it 



/x 2 + x x x 2 + 1\ 

d) x 2 + x x + 1 x 2 + x 1 

\x 2 - 1 2x 2x 2 + 2/ 

2. Let A be an m by n matrix whose elements are in F[x]. Describe a process by 
means of which we may use elementary row transformations involving only the 
ith and kth rows of A to replace A by a matrix with the g.c.d. of $a_{ij}$ and $a_{kj}$ in its 
ith row and jth column. Hint: Use the g.c.d. process of Section 1.6 with $f = a_{ij}$, 
$g = a_{kj}$. 

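3. Use elementary transformations to carry the following matrices into the 
form (9). 

a) $\begin{pmatrix} x & 0 \\ 0 & x + 1 \end{pmatrix}$    b) $\begin{pmatrix} x(x - 1) & 0 \\ 0 & x(x + 1) \end{pmatrix}$    c) $\begin{pmatrix} x^2 + x - 2 & 0 \\ 0 & x^2 + 2x - 3 \end{pmatrix}$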



4. Describe a process for reducing a matrix A with elements in F[x] to an 
equivalent diagonal matrix. How, then, may we use the process of Ex. 3 to carry this 
preliminary diagonal matrix into the form (9)? 

5. Reduce the matrices of Ex. 1 to the form (9) by the use of elementary trans- 
formations. 

6. Determine the invariant factors of the matrices of Ex. 1 by the use of (8). 
7. Use elementary transformations to reduce the following matrices to the 
form (9). 



)l 
>l 


(x 2 1 x 2 - x 8 x - x 2 - 2 \ 
01 x 2 x - 2 t 
x 8 -x 2 + x - x 8 - x 4 1 + 2x - x 2 - x 3 1 
2x 8 1 2 + x + x 2 -2x 4 -l+x-2x 8 / 

(x 2 -2x 2x-2 x 2 -2 \ 
4x + 4 3x + 2 -3x 4x + 6 I 
X X 1 X X + 1 1 




4x 2 + 4x 


2x 2 -2x 2 + x-2 4x 2 + 


5x~2/ 




(x + 3 


3x 2 + 3x 


2x + 3 x + 3\ 






x 


2x 2 


2x x 1 




c) 


1 + x - x 3 


3x 2 + x 


2x + 1 x + 1 1 






x - x + 1 


x-2x 2 


1 - 2x 1 - x/ 






/ -* 


2x + 1 


x- 1 


x + 1 ^ 




I 2-x 


x- 1 


x + 2 


x~2 


d) 


1 


3x + 4 


x 


2x + 4 




\-X 2 + X ~ 1 


x 2 + x + 2 


x 2 + 3x - 1 


x 2 - x + 2 




/2x 2 + 4x + 2 


-x 


2x 2 + 3x + 1 


x 2 -x- 


\ 


/ x 2 + x-3 


5x 2 + 2x 


x 2 - 1 


6x 2 + 5x H 



x + 2 -x 2 x + 1 ~x 2 

2. Elementary divisors. Let K be the field of all complex numbers so that 

        $g_1(x) = (x - c_1)^{e_1} \cdots (x - c_s)^{e_s}$,

where $c_1, \dots, c_s$ are the distinct complex roots of the first invariant factor 
of a matrix A with elements in K[x]. Since $g_{i+1}(x)$ divides $g_i(x)$, it is clear 
that every $g_i(x)$ divides $g_1(x)$, and thus 

(10)    $g_i(x) = (x - c_1)^{e_{i1}} \cdots (x - c_s)^{e_{is}} \qquad (i = 1, \dots, r)$.

Here r is the number of invariant factors of A, $e_{1j} = e_j$, and 
$e_{i-1,j} \ge e_{ij} \ge 0$ for $i = 2, \dots, r$. We shall call the rs polynomials 

        $f_{ij} = (x - c_j)^{e_{ij}} \qquad (i = 1, \dots, r;\ j = 1, \dots, s)$

the elementary divisors of A. Those for which $e_{ij} > 0$ will be called the 
nontrivial elementary divisors of A. 
The invariant factors of A clearly determine its elementary divisors 
uniquely as certain rs powers $f_{ij}$ of linear functions $x - c_j$ where r is the 
rank of A, and s is the number of distinct roots of the invariant factors 
of A. Conversely, the elementary divisors of A uniquely determine its 
invariant factors. In fact, let us consider a set of q polynomials each a power 
with positive integral exponent of a monic linear factor with a complex 
root. The distinct roots in our set may then be labeled $c_1, \dots, c_s$ and our 
polynomials have the form $h_{kj} = (x - c_j)^{n_{kj}}$ for $n_{kj} > 0$. For each $c_j$ let $t_j$ be 
the number of $h_{kj}$ in our set, t be the maximum $t_j$. Then, clearly, 
$q = t_1 + \dots + t_s \le ts$ and our set of polynomials may be extended to a set of 
exactly ts polynomials by adjoining $t - t_j$ polynomials $(x - c_j)^{n_{kj}}$ with 
$n_{kj} = 0$. Let us then order the t exponents $n_{kj}$ to be integers $e_{ij}$ satisfying 
$e_{1j} \ge e_{2j} \ge \dots \ge e_{tj} \ge 0$. Define $g_i(x)$ as in (10) for $i = 1, \dots, t$ and 
obtain a set of polynomials $g_i(x)$ such that $g_{i+1}(x)$ divides $g_i(x)$ for 
$i = 1, \dots, t - 1$; the $g_i(x)$ are the nontrivial invariant factors of a matrix A 
whose nontrivial elementary divisors are the given $h_{kj}$. If A has rank r, we 
have $r \ge t$, and we adjoin $(r - t)s$ new trivial elementary divisors $f_{ij}$ to 
obtain the complete set of r invariant factors $g_i(x)$ of A. 

It is now evident that two m by n matrices with elements in K[x] are 
equivalent in K[x] if and only if they have the same elementary divisors. 
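
The passage from nontrivial elementary divisors to nontrivial invariant factors described above is a sorting procedure, and may be sketched as follows. The representation is an assumption made for the illustration: an elementary divisor $(x - c)^n$ is given as the pair (c, n), and each invariant factor is returned as a dictionary sending a root $c_j$ to its exponent $e_{ij}$, with roots of exponent zero omitted. The function name is hypothetical.

```python
from collections import defaultdict


def invariant_factors_from_elementary_divisors(divisors):
    """Return g_1, ..., g_t as dicts {root: exponent}, with g_{i+1} dividing g_i.

    `divisors` is an iterable of (root, exponent) pairs with exponent > 0.
    """
    by_root = defaultdict(list)
    for root, exponent in divisors:
        by_root[root].append(exponent)
    # t is the largest number of elementary divisors belonging to one root.
    t = max((len(exps) for exps in by_root.values()), default=0)
    factors = []
    for i in range(t):
        g = {}
        for root, exps in by_root.items():
            ordered = sorted(exps, reverse=True)  # e_1j >= e_2j >= ...
            if i < len(ordered):
                g[root] = ordered[i]
        factors.append(g)
    return factors


# Assumed sample data: the elementary divisors x^2, x, (x - 2)^3.
print(invariant_factors_from_elementary_divisors([(0, 2), (0, 1), (2, 3)]))
# [{0: 2, 2: 3}, {0: 1}], that is, g_1 = x^2 (x - 2)^3 and g_2 = x
```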

The matrix (9) has prescribed invariant factors and hence prescribed 
elementary divisors. However, it is desirable to obtain a matrix of a form 
exhibiting the elementary divisors explicitly. We shall do this. Let us prove 
first 

Theorem 2. Let $f_1, \dots, f_s$ be monic polynomials of F[x] which are 
relatively prime in pairs. Then the only nontrivial invariant factor of the matrix 
$A = \mathrm{diag}\{f_1, \dots, f_s\}$ is its determinant $g = f_1 \cdots f_s$. 

The result is trivial for s = 1. If s = 2, the g.c.d. of the elements $f_1$ and 
$f_2$ of $A_2 = \mathrm{diag}\{f_1, f_2\}$ is unity, $d_1 = 1$, $f_1f_2$ is the only nontrivial invariant 
factor of $A_2$, and $A_2$ is equivalent in F[x] to $\mathrm{diag}\{f_1f_2, 1\}$. Assume, then, 
that $A_{s-1} = \mathrm{diag}\{f_1, \dots, f_{s-1}\}$ is equivalent in F[x] to 
$B_{s-1} = \mathrm{diag}\{g_{s-1}, 1, \dots, 1\}$, where $g_{s-1} = f_1 \cdots f_{s-1}$. Then 
$A = \mathrm{diag}\{f_1, \dots, f_s\}$ is equivalent in F[x] to $\mathrm{diag}\{g_{s-1}, f_s, 1, \dots, 1\}$. 
But $g_{s-1}$ is prime to $f_s$, so that $\mathrm{diag}\{g_{s-1}, f_s\}$ is equivalent in F[x] to 
$\mathrm{diag}\{g, 1\}$. Hence A is equivalent to $\mathrm{diag}\{g, 1, \dots, 1\}$. Then g is the only 
nontrivial invariant factor of A, and our theorem is proved. 

We see now that if $g_i(x)$ is defined by (10) for distinct complex numbers 
$c_j$, the corresponding elementary divisors $f_{i1}, \dots, f_{is}$ are relatively prime 
in pairs. By Theorem 2 the matrix $A_i = \mathrm{diag}\{f_{i1}, \dots, f_{is}\}$ is equivalent 
in F[x] to $\mathrm{diag}\{g_i, 1, \dots, 1\}$. But then $A = \mathrm{diag}\{A_1, \dots, A_t\}$ is 
equivalent in F[x] to $\mathrm{diag}\{g_1, \dots, g_t, 1, \dots, 1\}$, and hence the nontrivial 
invariant factors of A are $g_1, \dots, g_t$. We examine the form of A to see that 
we have proved 

Theorem 3. Let $c_1, \dots, c_r$ be complex numbers and $f_i = (x - c_i)^{n_i}$ for 
integers $n_i \ge 0$. Then the matrix 

        $A = \mathrm{diag}\{f_1, \dots, f_r\}$

has the $f_i$ as its elementary divisors. 

EXERCISES 

1. The following polynomials are the nontrivial invariant factors of a matrix. 
What are its nontrivial elementary divisors? 

a) x 6 + 2x* + x\ x* + x*, x* + x 

V) x 6 + z 5 + 2x* + 2x* + x* + x, x* + x, x 

c) x(x - I) 2 (x 2 - 2x + 1), z 3 - 2z 2 + x, , 3-1 

d) x* - 5s 8 + 9z 2 - 7z + 2, z 3 - 4z 2 + 5z - 2, a; 2 - 3z + 2 

e) (* 2 - l)^, (x 2 - I) 2 , (x 2 - 1), x - 1 



2. The following polynomials are the nontrivial elementary divisors of a matrix 
whose rank is six. What are its invariant factors? 

a) (x - 1)', (x - 1), (x - 1), (x + I) 2 , (x + 1) 

b) (x - 2)', (* - 2), (x - 2), x, (x + 1), z> 

c) (x-3), (*-3), (*-3), **, x, x 

d) x, (x - 1), (x - 2), (x - 3), (x - 4), (x - 5) 

e) (x + 1)', (x + 1)2, (x - 1), x, x, *', x' 
/) x, (x - 1), x*, (x - 1)*, x', (x + 1)S x* 

3. Find elementary transformations which carry the following matrices into the 
form (9). 

/(x - 1) \ /x 3 \ 

o)( x-2 0) 6)(0x + l 0) 

\ x - I/ \0 x + 2/ 

/x a \ 

c) (0 (x-l) 
\0 x + I/ 
3. Matric polynomials. Let the elements $a_{ij}$ of an m by n matrix A be 
in F[x] and let s be a positive integer not less than the degree of any of the 
$a_{ij}$. Then every $a_{ij}$ has s as a virtual degree and we may write 

(11)    $a_{ij} = a_{ij}^{(0)}x^s + a_{ij}^{(1)}x^{s-1} + \dots + a_{ij}^{(s)}$

for $a_{ij}^{(k)}$ in F. Define $A_k = (a_{ij}^{(k)})$ and obtain an expression of A in the form 

(12)    $f(x) = A_0x^s + \dots + A_s$

for m by n matrices $A_k$. Thus f(x) is a polynomial in x of virtual degree 
s with m by n matrix coefficients $A_k$ and virtual leading coefficient $A_0$. 
Moreover, we say that f(x) has degree s and leading coefficient $A_0$ if $A_0 \ne 0$. 

In order to be able to multiply as well as to add our matrices we shall 
henceforth restrict our attention to n-rowed square matrices with elements 
in F[x] and thus to the set of all polynomials in x with coefficients n-rowed 
square matrices having elements in F. Let us call these polynomials 
n-rowed matric polynomials. If all the $A_k$ in (12) are zero matrices, the 
polynomial f(x) is the zero polynomial, and we again designate it by 0. 
Evidently we have 

LEMMA 3. The degree of f(x) + g(x) is not greater than the degree of f(x) 
or g(x). 

LEMMA 4. Let f(x) have degree n and leading coefficient $A_0$, g(x) have 
degree m and leading coefficient $B_0$ such that $A_0B_0 \ne 0$. Then the degree of 
f(x)g(x) is m + n and the leading coefficient of f(x)g(x) is $A_0B_0$. 

As in Chapter I we use Lemma 4 in the derivation of the Division Algo- 
rithm for matric polynomials which we state as 

Theorem 4. Let f(x) and g(x) be n-rowed matric polynomials of respective 
degrees s and t such that the leading coefficient of g(x) is nonsingular. Then 
there exist unique polynomials q(x), Q(x), r(x), R(x) such that r(x) and R(x) 
have virtual degree $t - 1$ and 

(13)    $f(x) = q(x)g(x) + r(x) = g(x)Q(x) + R(x)$.

Moreover, if $s < t$, then q(x) = Q(x) = 0, while if $s \ge t$, then q(x) and 
Q(x) have degree $s - t$. 

While the proof is very much the same as that in Theorem 1.1, let us
give it in some detail. We assume first that s \geq t and let A_0 and B_0 be
the respective leading coefficients of f(x) and g(x). Then B_0^{-1} exists, and if
q_1(x) = A_0 B_0^{-1} x^{s-t} the polynomial f_1(x) = f(x) - q_1(x)g(x) has virtual
degree s - 1. This implies a finite process by means of which we begin with
a polynomial f_i(x) of virtual degree s - i \geq t and leading coefficient A_0^{(i)}
and then form f_{i+1}(x) = f_i(x) - q_{i+1}(x)g(x) of virtual degree s - i - 1
for q_{i+1}(x) = A_0^{(i)} B_0^{-1} x^{s-i-t}. The process terminates when we obtain an
f_l(x) of virtual degree t - 1. Thus we have the first equation of (13) with
q(x) = 0, r(x) = f(x) if s < t, and otherwise with q(x) of degree s - t and
leading coefficient A_0 B_0^{-1} and r(x) the f_l(x) above. If also f(x) = q_0(x)g(x) +
r_0(x) for q_0(x) - q(x) \neq 0, the degree of this polynomial is h \geq 0, and its
leading coefficient is C_0 \neq 0. Also C_0 B_0 \neq 0 since B_0 is nonsingular. But
then by Lemma 4 the degree of [q_0(x) - q(x)]g(x) = r(x) - r_0(x) is t + h,
whereas its virtual degree is t - 1, a contradiction. This proves the
uniqueness of q(x) and r(x). The existence and uniqueness of Q(x) and R(x) is
proved in exactly the same way except that we begin by forming

f(x) - g(x) B_0^{-1} A_0 x^{s-t}.

Let us regard the first equation of (13) as the right division of f(x) by
g(x), the second as the left division of f(x) by g(x). Then we shall speak
correspondingly of q(x) and r(x) as right quotient and remainder, of Q(x)
and R(x) as left quotient and remainder. If r(x) = 0, we have f(x) =
q(x)g(x), and g(x) is a right divisor of f(x). Similarly, we call g(x) a left
divisor of f(x) if f(x) = g(x)Q(x), so that R(x) = 0 in (13).

It is natural now to try to prove a Remainder Theorem for matric
polynomials. However, the theorem in usual form is ambiguous since, for
example, if C is an n-rowed square matrix and f(x) = A_0 x^2 = x^2 A_0 = x A_0 x,
the polynomial f(C) might mean any one of A_0 C^2, C^2 A_0, C A_0 C, and these
matrices might all be different. Thus we must first define what we shall
mean by f(C). We shall do this and obtain a Remainder Theorem which we
state as

Theorem 5. Let f(x) be an n-rowed square matric polynomial (12), C be
an n-rowed square matrix with elements in F. Define f_R(C) (read: f right of C),
and f_L(C) (read: f left of C) by

(14)    f_R(C) = A_0 C^s + A_1 C^{s-1} + ... + A_{s-1} C + A_s

and

(15)    f_L(C) = C^s A_0 + C^{s-1} A_1 + ... + C A_{s-1} + A_s.

Then the right and left remainders on division of f(x) by xI - C are f_R(C)
and f_L(C), respectively.
To make our proof as in the case of polynomials with coefficients in a
field we use Theorem 4 with g(x) = xI - C and have f(x) = q(x)g(x) +
r(x) where q(x) has degree s - 1 and r(x) = B has elements in F. We then
wish to draw our conclusion from f_R(C) = q_R(C)(C - C) + B. This statement
is correct, but let us examine it more closely. We write q(x) =
C_0 x^{s-1} + ... + C_{s-1} and have

h(x) = q(x)(xI - C)
     = (C_0 x^s + C_1 x^{s-1} + ... + C_{s-1} x) - (C_0 C x^{s-1} + ... + C_{s-1} C).

Then, if D is any n-rowed square matrix with elements in F, we have

h_R(D) = C_0 D^s + (C_1 - C_0 C) D^{s-1} + ... + (C_{s-1} - C_{s-2} C) D - C_{s-1} C

while

q_R(D)(D - C) = C_0 D^s + (C_1 D^{s-1} - C_0 D^{s-1} C) + ...
                + (C_{s-1} D - C_{s-2} D C) - C_{s-1} C,

and these matrices are equal in general* if and only if D and C are
commutative. They are equal if D = C, and thus f(x) = h(x) + B implies that
f_R(C) = h_R(C) + B = q_R(C)(C - C) + B = B. The second part of our
theorem is proved similarly.
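
The two halves of Theorem 5 may be checked numerically on a small case. The sketch below assumes the coefficient-list representation [A_0, ..., A_s] for f(x); the function names and the sample matrices are chosen for illustration only. Division by xI - C is carried out by the recurrence C_0 = A_0, C_k = A_k + C_{k-1} C read off from the expansion of q(x)(xI - C) above, and the result is compared with direct substitution in (14) and (15).

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_pow(X, k):
    n = len(X)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, X)
    return result

def right_division(f, C):
    """Return (q, r) with f(x) = q(x)(xI - C) + r."""
    rows = [f[0]]
    for A in f[1:]:
        rows.append(mat_add(A, mat_mul(rows[-1], C)))   # C_k = A_k + C_{k-1} C
    return rows[:-1], rows[-1]

def left_division(f, C):
    """Return (Q, R) with f(x) = (xI - C)Q(x) + R."""
    rows = [f[0]]
    for A in f[1:]:
        rows.append(mat_add(A, mat_mul(C, rows[-1])))   # Q_k = A_k + C Q_{k-1}
    return rows[:-1], rows[-1]

def f_right(f, C):
    """Substitution (14): A_0 C^s + A_1 C^(s-1) + ... + A_s."""
    s = len(f) - 1
    total = [[0] * len(C) for _ in C]
    for k, A in enumerate(f):
        total = mat_add(total, mat_mul(A, mat_pow(C, s - k)))
    return total

def f_left(f, C):
    """Substitution (15): C^s A_0 + C^(s-1) A_1 + ... + A_s."""
    s = len(f) - 1
    total = [[0] * len(C) for _ in C]
    for k, A in enumerate(f):
        total = mat_add(total, mat_mul(mat_pow(C, s - k), A))
    return total

# f(x) = A_0 x^2 + A_1 x + A_2 with sample coefficients
f = [[[1, 2], [0, 1]], [[0, 1], [1, 0]], [[3, 0], [0, 3]]]
C = [[1, 1], [0, 2]]

q, r = right_division(f, C)
Q, R = left_division(f, C)
print(r == f_right(f, C), R == f_left(f, C), r == R)
```

For these sample matrices the right remainder and the left remainder differ, which illustrates why f(C) alone is ambiguous for a general matric polynomial.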

As a consequence of the result just proved we have the Factor Theorem 
for matric polynomials which we state as 

Theorem 6. The matric polynomial f(x) has xI - C as a right divisor if
and only if f_R(C) = 0; it has xI - C as a left divisor if and only if f_L(C) = 0.

For Theorem 5 implies that, if f_R(C) = 0, then in (13) the polynomial
r(x) = 0, f(x) = q(x)(xI - C). Conversely, if f(x) = q(x)(xI - C), we
have seen that f_R(C) = q_R(C)(C - C) = 0. The results on the left follow
similarly.

Our principal use of the result above is precisely what is usually called
the trivial part of Theorem 5, that is, if f(x) has x - C as a factor, then
C is a root of f(x). However, it is nontrivial that f(C) = 0 and follows
only from the study above where we showed that if D is any square matrix
such that DC = CD, then h(x) = q(x)(xI - C) implies that h_R(D) =
q_R(D)(D - C).

* E.g., if q(x) = x so that h(x) = x^2 - Cx, then h_R(D) = D^2 - CD, q_R(D)(D - C) =
D^2 - DC \neq D^2 - CD unless DC = CD.
EXERCISES

1. Express the following matrices as polynomials f(x) and g(x) with matric
coefficients and compute q(x), Q(x), r(x), R(x) of (13).



2z 4 - x* + 2 -x 8 + x - 1 




(1 O'rS __ ff Q/2 _l 

X t iJU JU \jJ(/ |" 

3 4 y? - 1 x 2 i 

0*2 + x 3a .2 x 2 

/v.2 2^2 T 3 5 



2. Use f(x) of Ex. 1(c) and find f_R(C) and f_L(C) by the use of the division process
as well as by substitution if

/O 1 0\ /O 2 0\ /I 2 0\ 
a)C=(0 l) 6)C=(l c) (7=12-1 
\2 O/ \0 I/ \0 I/ 

4. The characteristic matrix and function. If f(x) is a matric polynomial
(12) with n-rowed scalar matrix coefficients A_k, then we shall call f(x) a
scalar polynomial. Thus A_k = a_k I for the a_k in F, and

(16)    f(x) = (a_0 x^s + ... + a_s) I,

where I is the n-rowed identity matrix. We call f(x) monic if a_0 = 1. If
now g(x) is also a scalar polynomial, the quotients q(x) and Q(x) in (13)
are the same scalar polynomials and also r(x) = R(x) is scalar. For
obviously this case of Theorem 4 is now the result of multiplying all the
polynomials of Theorem 1.1 by the n-rowed identity matrix.
If A is any n-rowed square matrix, the polynomials f_R(A) and f_L(A) are
equal for every scalar polynomial f(x), and we shall designate their
common value by

(17)    f(A) = a_0 A^s + ... + a_{s-1} A + a_s I.

We now say that either the polynomial f(x) or the equation f(x) = 0 has A
as a root if f(A) = 0. By Theorem 6 the matrix A is a root of f(x) if and
only if f(x) has xI - A as either a right- or a left-hand factor.

We shall call the matrix xI - A the characteristic matrix of A, the
determinant |xI - A| the characteristic determinant of A, the scalar
polynomial

(18)    f(x) = |xI - A| I

the characteristic function of A, and the corresponding equation f(x) = 0
the characteristic equation of A. We now apply (3.27) to xI - A to obtain

(19)    (xI - A)[adj (xI - A)] = [adj (xI - A)](xI - A) = |xI - A| I.

Then the elements of adj (xI - A) are the cofactors of the elements of
xI - A, and adj (xI - A) is a matric polynomial (in general, nonscalar).
By the argument above we have

Theorem 7. Every square matrix is a root of its characteristic equation. 
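
Theorem 7 is easily verified on a particular matrix. The sketch below treats only the two-rowed case, where |xI - A| = x^2 - (a + d)x + (ad - bc); the function names and the sample matrix are assumptions of the illustration, and f(A) is evaluated as in (17) by Horner's rule.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scalar_poly_at(coeffs, A):
    """f(A) = a_0 A^s + ... + a_{s-1} A + a_s I for coeffs = [a_0, ..., a_s]."""
    n = len(A)
    value = [[0] * n for _ in range(n)]
    for a in coeffs:
        value = mat_mul(value, A)     # multiply the running value by A
        for i in range(n):
            value[i][i] += a          # then add a * I
    return value

def characteristic_coeffs_2(A):
    """Coefficients of |xI - A| for a two-rowed matrix A."""
    (a, b), (c, d) = A
    return [1, -(a + d), a * d - b * c]

A = [[2, 3], [1, -2]]
coeffs = characteristic_coeffs_2(A)   # here x^2 - 7
print(scalar_poly_at(coeffs, A))      # prints: [[0, 0], [0, 0]]
```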

The g.c.d. of the elements of adj (xI - A) is clearly the polynomial d_{n-1}(x)
defined for xI - A and d_n(x) = |xI - A|. But then adj (xI - A) = d_{n-1}(x)B(x),
where B(x) is an n-rowed square matrix with elements in F[x]. Hence B(x)
is a matric polynomial. The invariant factors of xI - A were defined in (8),
and by (19) |xI - A| I = g_1(x) d_{n-1}(x) I = (xI - A)B(x) d_{n-1}(x). By the
uniqueness of quotient in Theorem 4 we have

(20)    g(x) = g_1(x) I = (xI - A)B(x).

Hence, clearly, g(A) = 0. Observe also that the g.c.d. of the elements of
B(x) is unity so that if B(x) = Q(x)q(x) for a monic scalar polynomial
q(x), then q(x) = 1.

We now define the minimum function of a square matrix A to be the 
monic scalar polynomial of least degree with A as a root. The remark just 
made above then implies 

Theorem 8. Let g_1(x) be the first invariant factor of the characteristic
matrix of A. Then g(x) = g_1(x)I is the minimum function of A.

For if h(x) is the minimum function of A we may write g(x) = h(x)q(x) +
r(x) for scalar polynomials q(x) and r(x) such that the degree of r(x) is less
than that of h(x). But h(A) = 0, and g(A) = 0 so that r(A) = 0, and r(x) = 0
since h(x) has least degree among the scalar polynomials with A as a root.
Hence g(x) = h(x)q(x), and since g(x) and h(x) are monic so is q(x). By
Theorem 6 we have h(x) = (xI - A)Q(x), and by (20) we have g(x) =
(xI - A)Q(x)q(x) = (xI - A)B(x). The uniqueness in Theorem 4 then
states that B(x) = Q(x)q(x), q(x) = 1, from which g(x) = h(x) as desired.

We see now that |xI - A| is a monic polynomial of degree n and is not
zero, r = m = n in (9), and

(21)    P(x)(xI - A)Q(x) = diag {g_1, ..., g_n}

for elementary matrices P(x) and Q(x). But then, as we have already
observed in an earlier discussion,

(22)    c |xI - A| d = g_1 ... g_n

for c = |P(x)| and d = |Q(x)| in F. Hence cd = 1, and we have proved

Theorem 9. The characteristic function of a square matrix A is the product
by I of the product of the invariant factors of the characteristic matrix of A.

This result implies that |xI - A| is the product of g_1(x) by divisors
g_i(x) of g_1(x). It follows that every root of |xI - A| in any field K
containing F is a root of g_1(x). But in fact we have already seen that if F is
the field of all complex numbers, the elementary divisors of xI - A are
polynomials (x - c_j)^{n_{ij}} whose product is |xI - A|. Then the c_j are the
distinct roots of g_1(x) as well as of |xI - A|. They are called the
characteristic roots of A.

In closing this section we note that if we write f(x) = |xI - A| I =
(x^n + a_1 x^{n-1} + ... + a_n) I, then f(0) = |-A| I = (-1)^n |A| I, so that
|A| = (-1)^n a_n. It follows that, if |A| \neq 0, then

(23)    A^{-1} = -a_n^{-1}(A^{n-1} + a_1 A^{n-2} + ... + a_{n-1} I),

and hence it is obvious that A A^{-1} = A^{-1} A. Moreover, if |A| = 0, then
A^{-1} does not exist. If, then, A \neq 0 and g_1(x) = x^m + b_1 x^{m-1} + ... + b_m,
the polynomial g(x) = g_1(x)I can be the minimum function of A only if
b_m = 0. Since A \neq 0, we have m > 1, and G = A^{m-1} + b_1 A^{m-2} + ... +
b_{m-1} I is a nonzero matrix with the property

(24)    AG = GA = 0.
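
Formulas (23) and (24) may be tried on two-rowed matrices, for which |xI - A| = x^2 + a_1 x + a_2 with a_1 = -(a + d) and a_2 = ad - bc. The following is a minimal sketch under that restriction; the function names and sample matrices are assumptions of the illustration, and exact rational arithmetic is used so that the check A A^{-1} = I is not disturbed by rounding.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2(A):
    """A^{-1} from (23) with n = 2, that is, A^{-1} = -(1/a_2)(A + a_1 I)."""
    (a, b), (c, d) = A
    a1, a2 = -(a + d), a * d - b * c       # |xI - A| = x^2 + a1 x + a2
    if a2 == 0:
        raise ValueError("|A| = 0, so the inverse does not exist")
    scale = Fraction(-1, a2)
    return [[scale * (a + a1), scale * b],
            [scale * c, scale * (d + a1)]]

A = [[2, 1], [4, 3]]
A_inv = inverse_2(A)
print(mat_mul(A, A_inv) == [[1, 0], [0, 1]])   # prints: True

# A singular nonzero matrix S. It is not a scalar matrix, so its minimum
# function has degree 2 and is x^2 + b1 x with b1 = -(trace of S).
S = [[1, 2], [2, 4]]
b1 = -(S[0][0] + S[1][1])
G = [[S[0][0] + b1, S[0][1]], [S[1][0], S[1][1] + b1]]   # G = S + b1 I
print(mat_mul(S, G), mat_mul(G, S))
```

The second part exhibits a nonzero G with SG = GS = 0, as in (24).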
EXERCISES

1. When is the minimum function of a matrix linear? 

2. What, then, are the minimum functions of the following matrices? 

. /2 3\ ,. /O 1\ /2 3\ /o 0\ 

> (l -2J 6 ) (6 a) e > ( 4 l) (o 6J 

3. Let A = diag {A_1, A_2} where A_1 and A_2 are square matrices of m and n rows
respectively. Show that if f(x) is any polynomial in F[x] and we define f_t(x) =
f(x)I_t for any t, then f_{m+n}(A) = diag {f_m(A_1), f_n(A_2)}. Hint: Prove first by
induction that A^k = diag {A_1^k, A_2^k}.

4. Let A have the form of Ex. 3 and let g(x)I_{m+n}, g_1(x)I_m, and g_2(x)I_n be the
respective minimum functions of A, A_1, A_2. Prove that g(x) is the least common
multiple of g_1(x) and g_2(x).

5. Apply Ex. 4 in the case where A_1 is nonsingular and A_2 = 0.

6. Compute the characteristic functions of the following matrices. 








7. It may be shown that the characteristic function f(x) = x^n - a_1 x^{n-1} +
... + (-1)^i a_i x^{n-i} + ... + (-1)^n a_n of an n-rowed square matrix A has the
property that a_i is the sum of all i-rowed principal minors of A. Verify this for the
matrices of Ex. 6.

5. Similarity of square matrices. We have defined two n-rowed square
matrices A and B with elements in a field F to be similar in F if there exists
a nonsingular matrix P with elements in F such that

P A P^{-1} = B.

The principal result on the similarity of square matrices is then given by

Theorem 10. Two matrices are similar in F if and only if their
characteristic matrices have the same invariant factors.

For if P A P^{-1} = B, then P(xI - A)P^{-1} = x P I P^{-1} - B = xI - B.
But P has elements in F, |P| \neq 0 is in F, P is elementary. Hence xI - A
and xI - B are equivalent in F[x] and have the same invariant factors.
Conversely, let xI - A and xI - B have the same invariant factors, so that

(25)    P(x)(xI - A)Q(x) = xI - B

for elementary matrices P(x) and Q(x). We define

(26)    P = P_L(B),    Q = Q_R(B)

as in Theorem 5 and have

(27)    P(x) = (xI - B)P_0(x) + P,    Q(x) = Q_0(x)(xI - B) + Q.

Then

P(x)(xI - A)Q(x) = (xI - B)P_0(x)(xI - A)Q(x)
                   + P(xI - A)Q_0(x)(xI - B) + P(xI - A)Q = xI - B.

We now use (25) and the fact that P(x) and Q(x) are elementary to write
[P(x)]^{-1} = C(x), [Q(x)]^{-1} = D(x) for matric polynomials C(x) and D(x)
such that

(28)    (xI - A)Q(x) = C(x)(xI - B),    P(x)(xI - A) = (xI - B)D(x).

But then from (27) and (28)

P(xI - A) = (xI - B)[D(x) - P_0(x)(xI - A)]

and thus

(29)    (xI - B) - P(xI - A)Q = (xI - B)R(x)(xI - B),

where R = R(x) = P_0(x)C(x) + D(x)Q_0(x) - P_0(x)(xI - A)Q_0(x). By
Lemma 4 the degree in x of the right member of (29) is at least two unless
R(x) = 0. But the degree in x of the left member of (29) is at most one,
R(x) = 0,

(30)    P(xI - A)Q = xI - B.

It follows that PAQ = B, PIQ = I, Q = P^{-1}, P A P^{-1} = B as desired.

Observe that the degree of |xI - A| is n and hence that if n_i is the
degree of the ith nontrivial invariant factor g_i(x) the property |xI - A| =
g_1(x) ... g_t(x) implies that

(31)    n_1 + n_2 + ... + n_t = n,    n_1 \geq n_2 \geq ... \geq n_t > 0.

Obviously, this is an important restriction on the possible degrees of the in- 
variant factors of the characteristic matrix of an n-rowed square matrix. 
EXERCISES

1. What are all possible types of invariant factors of the characteristic matrices 
of square matrices having 1, 2, 3, or 4 rows? 

2. Give the possible elementary divisors of such matrices. 

3. Use the proof of Theorem 10 to show that if A_1, A_2, B_1, and B_2 are n-rowed
square matrices such that A_1 and B_1 are nonsingular, then A_1 x + A_2 and B_1 x + B_2
are equivalent in F[x] if and only if there exist nonsingular matrices P and Q with
elements in F such that P A_1 Q = B_1 and P A_2 Q = B_2. Hint: Take A = -A_1^{-1} A_2,
B = -B_1^{-1} B_2 in (25).

4. Show that the hypothesis that A_1 is nonsingular in Ex. 3 is essential by proving
that A_1 x - I and B_1 x - I are equivalent in F[x], yet P and Q do not exist, if



B l 



6. Characteristic matrices with prescribed invariant factors. If g_1(x),
..., g_t(x) are the nontrivial invariant factors of a matrix xI - A and n_i is
the degree of g_i(x), then by Theorem 10 the n-rowed square matrix

B = diag {B_1, ..., B_t}

will be similar in F to A if B_i is an n_i-rowed square matrix such that the
only nontrivial invariant factor of the characteristic matrix of B_i is g_i(x).
For then xI - B = diag {xI_{n_1} - B_1, ..., xI_{n_t} - B_t} is equivalent in
F[x] to diag {G_1, ..., G_t}, where G_i = diag {g_i, 1, ..., 1}, and we
conclude that xI - B has the same invariant factors as xI - A. Thus the problem
of constructing an n-rowed square matrix A whose characteristic matrix has
prescribed invariant factors is completely solved by the result we state as

Theorem 9. Let g(x) = x^n - (b_1 x^{n-1} + ... + b_n) and

(32)    A = \begin{pmatrix}
            0 & 1 & 0 & \cdots & 0 \\
            0 & 0 & 1 & \cdots & 0 \\
            \vdots & & & & \vdots \\
            0 & 0 & 0 & \cdots & 1 \\
            b_n & b_{n-1} & b_{n-2} & \cdots & b_1
        \end{pmatrix}.

Then g(x)I is both the characteristic and the minimum function of A, g(x) is
the only nontrivial invariant factor of xI - A.

For

(33)    xI - A = \begin{pmatrix}
            x & -1 & 0 & \cdots & 0 \\
            0 & x & -1 & \cdots & 0 \\
            \vdots & & & & \vdots \\
            0 & 0 & \cdots & x & -1 \\
            -b_n & -b_{n-1} & \cdots & -b_2 & x - b_1
        \end{pmatrix}.

The complementary submatrix of the element -b_n of (33) is an (n - 1)-rowed
square triangular matrix with diagonal elements all -1 and its
determinant is (-1)^{n-1}. Thus the cofactor of -b_n is (-1)^{n+1}(-1)^{n-1} = 1
and hence d_{n-1}(x) = 1. It remains to prove that d_n(x) = |xI - A| = g(x).
This is true if n = 1, since then A = A_1 = (b_1), and |xI - A| = x - b_1.
Let it be true for matrices A_{n-1} of the form (32) and of n - 1 rows, so that
|xI - A_{n-1}| = x^{n-1} - (b_1 x^{n-2} + ... + b_{n-1}) is the cofactor of the element
x in the first row and column of (33). We now expand (33) according to its
first column and obtain |xI - A| = x[x^{n-1} - (b_1 x^{n-2} + ... + b_{n-1})] -
b_n = g(x) as desired. This proves our theorem.
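
The theorem just proved may be checked on a numerical instance. The sketch below builds the matrix (32) from sample values b_1, b_2, b_3 (chosen arbitrarily for illustration, as are the function names), verifies that g(A) = 0, and lists the first rows of I, A, A^2. These first rows are the unit vectors, so no nonzero scalar polynomial of degree less than n can have A as a root, in agreement with the statement that g(x)I is the minimum function.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scalar_poly_at(coeffs, A):
    """f(A) = a_0 A^s + ... + a_s I for coeffs = [a_0, ..., a_s] (Horner's rule)."""
    n = len(A)
    value = [[0] * n for _ in range(n)]
    for a in coeffs:
        value = mat_mul(value, A)
        for i in range(n):
            value[i][i] += a
    return value

def companion(b):
    """Matrix (32) for g(x) = x^n - (b_1 x^(n-1) + ... + b_n), b = [b_1, ..., b_n]."""
    n = len(b)
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    rows.append(list(reversed(b)))        # last row: b_n, b_{n-1}, ..., b_1
    return rows

b = [2, -1, 3]                            # g(x) = x^3 - (2x^2 - x + 3)
A = companion(b)
g_coeffs = [1] + [-bi for bi in b]        # 1, -b_1, ..., -b_n
g_of_A = scalar_poly_at(g_coeffs, A)
print(g_of_A)                             # the zero matrix

n = len(b)
power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
first_rows = []
for _ in range(n):                        # first rows of I, A, ..., A^(n-1)
    first_rows.append(power[0])
    power = mat_mul(power, A)
print(first_rows)
```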

The construction of square matrices A with complex number elements
whose characteristic matrices have prescribed elementary divisors has a
simple solution, and we shall see that the argument preceding Theorem 9
reduces the solution to the proof of

Theorem 10. Let c be a complex number, A be the n-rowed square matrix

(34)    A = \begin{pmatrix}
            c & 1 & 0 & \cdots & 0 & 0 \\
            0 & c & 1 & \cdots & 0 & 0 \\
            \vdots & & & & & \vdots \\
            0 & 0 & 0 & \cdots & c & 1 \\
            0 & 0 & 0 & \cdots & 0 & c
        \end{pmatrix}.

Then the only nontrivial invariant factor of xI - A is (x - c)^n.

For xI - A is a triangular matrix with diagonal elements all x - c,
|xI - A| = (x - c)^n. The complementary minor of the element in the
first row and nth column of xI - A is the determinant of a triangular matrix
with diagonal elements all -1 and is therefore a nonzero constant. Thus
d_{n-1}(x) = 1, and d_n(x) = (x - c)^n is the only nontrivial invariant
factor of xI - A.
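
Since the only nontrivial invariant factor of the characteristic matrix of (34) is (x - c)^n, Theorem 8 gives (x - c)^n I as the minimum function of A. A short numerical sketch of this consequence follows, with c = 2 and n = 4 taken as sample values and with function names that are assumptions of the illustration: the matrix N = A - cI satisfies N^n = 0 while N^{n-1} is not zero.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan_block(c, n):
    """The n-rowed matrix (34): c on the diagonal, 1 just above it."""
    return [[c if j == i else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

c, n = 2, 4
A = jordan_block(c, n)
N = [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]  # A - cI

zero = [[0] * n for _ in range(n)]
powers = [N]                       # powers[k - 1] holds N^k
for _ in range(n - 1):
    powers.append(mat_mul(powers[-1], N))

print(powers[n - 2] != zero)       # (A - cI)^(n-1) is not zero
print(powers[n - 1] == zero)       # (A - cI)^n is zero
```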

Thus if c_1, ..., c_t are complex numbers and n_1, ..., n_t are positive
integers, we construct matrices A_i of the form (34) for c = c_i and with n_i
rows. The matrix A = diag {A_1, ..., A_t} then has n = n_1 + ... + n_t rows,
and its characteristic matrix xI - A = diag {B_1, ..., B_t}, where B_i =
xI_{n_i} - A_i is equivalent in F[x] to diag {f_i, 1, ..., 1} such that
f_i = (x - c_i)^{n_i}. But then xI - A is equivalent in F[x] to
diag {f_1, ..., f_t, 1, ..., 1}, and by Theorem 3 the nontrivial elementary
divisors of xI - A are f_1, ..., f_t as desired.
EXERCISES

1. Compute the invariant factors and elementary divisors of the characteristic 
matrices of the following matrices. 




-6 2 -5 -19\ 
2 1 5 1 

-2 1 0-5 
3-12 9^ 

2. Find a matrix B = diag {B_1, ..., B_t} similar in F to each of the matrices of
Ex. 1, respectively, where B_i has the form (32), and the characteristic function of B_i
is the ith nontrivial invariant factor of A.

3. Solve Ex. 2 with the characteristic function of B_i now the ith nontrivial
elementary divisor of A, B_i of the form (34).

7. Additional topics. There are many important topics of the theory of 
matrices other than those we have discussed, and we leave their exposition 
to more advanced texts. Let us mention some of these topics here, however. 

The quantities of the field K of all complex numbers have the form

c = a + bi    (a, b in R, i^2 = -1),

where R is the field of all real numbers, a subfield of K. The complex
conjugate of c is

\bar{c} = a - bi,

and the correspondence c <-> \bar{c} defines a self-equivalence or automorphism
of K, a fact verified in Section 6.9. This automorphism of K leaves the
elements of its subfield R unaltered, that is, \bar{a} = a for every real a. We
then may call it an automorphism over R.
If A is any m by n matrix with elements a_{ij} in K, we define \bar{A} to be the
m by n matrix whose element in its ith row and jth column is \bar{a}_{ij}. It is then
a simple matter to verify that

\overline{AB} = \bar{A} \bar{B},    (\bar{A})' = \overline{A'},    \overline{C^{-1}} = (\bar{C})^{-1},

for every A, B, and nonsingular C. Moreover, then (\overline{AB})' = \bar{B}' \bar{A}'. We
now define a matrix A to be Hermitian if \bar{A}' = A, skew-Hermitian if \bar{A}' =
-A. Two matrices A and B are said to be conjunctive in K if there exists a
nonsingular matrix P with elements in K such that

P A \bar{P}' = B.

The results of this theory are almost exactly the same as those of the theory
of congruent matrices, and it is, in fact, possible to obtain a general theory
including both of the theories above as special cases.

Two symmetric matrices A and B with elements in a field F are said to
be orthogonally equivalent in F if there exists an orthogonal matrix P with
elements in F such that PAP' = B. But PP' = P'P = I so that B =
P A P^{-1} is both similar and congruent to A. Analogously, we call a matrix P
with complex elements such that P \bar{P}' = \bar{P}' P = I, a unitary matrix. Then
we say that two Hermitian matrices A and B are unitary equivalent if
P A \bar{P}' = B, where P is a unitary matrix. Both of these concepts may also
be shown to be special cases of a more general concept.

Finally, let us mention the topic of the equivalence and congruence of
pairs of matrices. Let A, B, C, D be matrices of the same numbers of rows
and columns. Then we call the pairs A, B and C, D equivalent* pairs if
there exist nonsingular square matrices P and Q such that simultaneously
PAQ = C and PBQ = D. Similarly, if A, B, C, D are n-rowed square
matrices, we call A, B and C, D congruent pairs if simultaneously PAP' =
C and PBP' = D for a nonsingular matrix P.

References to treatments of the topics mentioned above as well as others 
will be found in the final bibliographical section of Chapter VI. We shall 
not state any of the results here. 

* In this connection see Exs. 3 and 4 of Section 5. 



CHAPTER VI 
FUNDAMENTAL CONCEPTS 

1. Groups. These pages were written in order to bridge the gap between 
the intuitive function-theoretic study of algebra, as presented in the usual 
course on the theory of equations, and the abstract approach of the author's 
Modern Higher Algebra. The objective of our exposition has now been at- 
tained. For our study of matrices with constant elements led us naturally 
to introduce the concepts of field, linear space, correspondence, and equiva- 
lence, and we are ready now to begin the study of abstract algebra. How- 
ever, we believe it desirable to precede the serious study of material such 
as that of the first two chapters of the Modern Higher Algebra by a brief 
discussion of this subject matter, without proofs (or exercises). We shall 
give this discussion here and shall therewith not only leave our readers with 
an acquaintance with the basic concepts of algebraic theory but with a 
knowledge of how these concepts may lead into those branches of mathe- 
matics called the Theory of Numbers and the Theory of Algebraic Numbers. 

Our first new concept is that of a set G of elements closed with respect 
to a single operation, and we wish to define the concept that G forms a 
group with respect to this operation. It should be clear that if we do not 
state the nature either of the elements of G or of the operation, it will not 
matter if we indicate the operation as multiplication. If we wish later to 
consider special sets of elements with specified operations we shall then re- 
place "product" in our definition by the operation desired. Thus we shall 
make the 

DEFINITION. A set G of elements a, b, c, . . . is said to form a group with 
respect to multiplication if for every a, b, c of G 

I. The product ab is in G; 
II. The associative law a(bc) = (ab)c holds;
III. There exist solutions x and y in G of the equations 

ax = b , ya = b . 

The reader is already familiar with the groups (with respect to ordinary 
multiplication) of all nonzero rational numbers, all nonzero real numbers, 
and, indeed, all nonzero elements of any field. These are all examples of
groups G such that for every a and b of G we have

IV. The products ab = ba.

Such groups are called commutative or abelian groups. An example of a
nonabelian (noncommutative) group is the group, with respect to matrix
multiplication, of all nonsingular n-rowed square matrices with elements in a
field.

Every group G contains a unique element e called its identity element,
such that for every a of G

(1)    ae = ea = a.

Moreover, every element a of G has a unique inverse a^{-1} in G such that

(2)    a a^{-1} = a^{-1} a = e.

Then the solutions of the equations of Axiom III are the unique elements

(3)    x = a^{-1} b,    y = b a^{-1}.


A set H of elements of a group G is called a subgroup of G if the product
of any two elements of H is in H, H contains the identity element of G
and the inverse element h^{-1} of every h of H. Then H forms a group with
respect to the same operation as does G.

The equivalence of two groups is defined as an instance of the general 
definition of equivalence which we gave in Section 4.5. The concept of 
equivalence of two mathematical systems of the same kind as well as the 
concept of subsystem (e.g., subgroup of a group, subfield of a field, linear 
subspace over F of a linear space over F) are two concepts of evident funda- 
mental importance which are given in algebraic theory whenever any new 
mathematical system is defined. 

The number of elements in a group G is called its order. This number is 
either infinity, and we call G an infinite group; or it is a finite number n, and 
we call G a finite group of order n. For finite groups we have the important 
result which we shall state without proof. * 

LEMMA 1. Let H be a subgroup of a finite group G. Then the order of H 
divides the order of G. 

A simple example of a finite abelian group is the set of all nth roots of
unity. The reader may verify that an example of a finite nonabelian group
is given by the quaternion group of order 8 whose elements are the
two-rowed matrices with complex elements,

(4)    I,  A = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},  B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},  AB,  -I,  -A,  -B,  -AB.

* The proof is given in Chapter VI of my Modern Higher Algebra.

The set of all powers

(5)    a^0 = e, a, a^{-1}, a^2, a^{-2}, ...

of an element a of a group G forms a subgroup of G which we shall
designate by

(6)    [a]

and shall call the cyclic group generated by a. Its order is called the order of
the element a and it can be shown that either all the powers (5) are distinct
and a has infinite order, or a has finite order m, and [a] consists of the m
distinct powers

(7)    e, a, a^2, ..., a^{m-1}

where e is the identity element of [a], a^m = e. Then the order m of a is the
least positive integer t such that a^t = e. Moreover, it can be shown that
a^t = e if and only if m divides t.

The order m of an element a of a finite group G divides the order n of G,
since m is the order of the subgroup [a], and we may apply Lemma 1. Thus
n = mq, a^n = (a^m)^q = e^q = e. We therefore have

LEMMA 2. Let e be the identity element of a group G of order n. Then

(8)    a^n = e

for every a of G.
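
The statements about the quaternion group (4) and Lemma 2 lend themselves to a direct machine check. The sketch below starts from the matrices I, A, B of (4), forms all products until no new matrices appear, and then tests the order of the group, the order of each element, and the relation a^8 = e. The function names are assumptions of the illustration; the entries are Python complex numbers with integral parts, so the arithmetic is exact.

```python
def mul(X, Y):
    """Product of two-rowed matrices stored as tuples of rows."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

I = ((1, 0), (0, 1))
A = ((1j, 0), (0, -1j))
B = ((0, -1), (1, 0))

# Close the set {I, A, B} under multiplication.
G = {I, A, B}
while True:
    new = {mul(X, Y) for X in G for Y in G} - G
    if not new:
        break
    G |= new

def order(X):
    """Least positive integer t with X^t = I."""
    t, P = 1, X
    while P != I:
        P = mul(P, X)
        t += 1
    return t

def power(X, t):
    P = I
    for _ in range(t):
        P = mul(P, X)
    return P

orders = sorted(order(X) for X in G)
print(len(G), orders)
print(all(power(X, len(G)) == I for X in G))   # Lemma 2: a^n = e
print(mul(A, B) != mul(B, A))                  # the group is nonabelian
```

Every element order found in this way divides 8, in agreement with Lemma 1 applied to the cyclic subgroups [a].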



2. Additive groups. In any field and in the set of all n-rowed square mat- 
rices there are two operations. Thus we have said that the set of all nonzero 
elements of a field and the set of all nonsingular n-rowed square matrices 
form multiplicative groups. But the elements of any field, the set of all 
m by n matrices with elements in a field, the elements of any linear space, 
all form additive groups, that is, groups with respect to the operation of 
addition as defined for each of these mathematical systems. The reader will 
observe that the axioms for an additive abelian group G are those axioms 
for addition which we gave in Section 3.12 for a field. Additive groups are
normally assumed to be abelian, that is, the use of addition to designate the 
operation with respect to which a group G is defined is usually taken to 
connote the fact that G is abelian. 

The identity element of an additive group is usually called its zero
element, that is, the element 0 such that a + 0 = a = 0 + a. The
inverse with respect to addition of a is designated as -a and is such that
a + (-a) = (-a) + a = 0. Thus the solutions of the additive formulation

(9)    a + x = b,    y + a = b,

of the equations of our group Axiom III are

(10)    x = (-a) + b,    y = b + (-a).

When G is abelian, we have x = y and designate their common value by
b - a. Thus we define the operation of subtraction in terms of that of
addition.

In a cyclic additive group [a] the elements are always designated by

(11)    0, a, -a, 2 \cdot a, -(2 \cdot a), ...,

where, clearly, if m is any positive integer -(m \cdot a) = m \cdot (-a), and we
define (-m) \cdot a = -(m \cdot a). Here m \cdot a does not mean the product of a
by the positive integer m but means the sum a + ... + a with m summands.
If [a] is a finite group of order m, the elements of [a] are 0, a, 2 \cdot a,
..., (m - 1) \cdot a, and m is the least positive integer such that the sum of m
summands all equal to a is zero. However, if [a] is infinite, then it may be
seen that n \cdot a = q \cdot a for any integers n and q if and only if n and q are
equal.

3. Rings. The set consisting of all n-rowed square matrices with elements
in a field F is an instance of a certain type of mathematical system called a
ring. Many other systems which are known to the reader are rings and we
shall make the

DEFINITION. A ring is an additive abelian group R of at least two distinct
elements such that, for every a, b, c of R,

I. The product ab is in R;
II. The associative law a(bc) = (ab)c holds;
III. The distributive laws a(b + c) = ab + ac, (b + c)a = ba + ca hold.

We leave to the reader the explicit formulation of the definitions of
subring and equivalence of rings. They may be found in the first chapter of the
Modern Higher Algebra. The reader should also verify that all nonmodular
fields are rings, and that the set of all ordinary integers is a ring.

The zero element of a ring R is its identity element with respect to addi- 
tion. Observe that by making the hypothesis that R contains at least two 
elements we exclude the mathematical system consisting of zero alone from 
the systems we have called rings. 

Rings may now be seen to be mathematical systems R whose elements
have all the ordinary properties of numbers except that possibly the
products ab and ba might be different elements of R, the equations ax = b or
ya = b might not have solutions in R if a \neq 0, b are in R. A ring may also
contain divisors of zero, that is, elements a \neq 0, c \neq 0 such that ac = 0.
In particular, the ring of all n-rowed square matrices has already been seen 
to have such elements as well as the other properties just mentioned. 

A ring is said to possess a unity element e if e in R has the property 
ea = ae = a for every a of R. The element e then has the properties of the 
ordinary number 1 and is usually designated by that symbol. The unity 
element of the set of all n-rowed square matrices is the n-rowed identity 
matrix, and the unity element was always the number 1 in the other rings 
we have studied. However, the set of all two-rowed square matrices of the
form

(12)    \begin{pmatrix} 0 & r \\ 0 & 0 \end{pmatrix}

with r rational may easily be seen to be a ring without a unity element. In
fact, all nonzero elements of this ring are divisors of zero.
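
The assertions about the ring (12) may be tested on a finite sample of its elements. This is only a sketch: the sample of rational values r, and the helper names elem, add, and mul, are assumptions of the illustration. The product of any two matrices of the form (12) is the zero matrix, so that every nonzero element is a divisor of zero and no element can serve as a unity element.

```python
from fractions import Fraction

def elem(r):
    """The matrix (12) with the rational number r in its first row, second column."""
    return ((0, r), (0, 0))

def add(X, Y):
    return tuple(tuple(x + y for x, y in zip(rx, ry)) for rx, ry in zip(X, Y))

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

samples = [elem(Fraction(p, q)) for p in range(-2, 3) for q in (1, 2, 3)]
zero = elem(0)

sums_have_form = all(add(X, Y) == elem(X[0][1] + Y[0][1])
                     for X in samples for Y in samples)
products_vanish = all(mul(X, Y) == zero for X in samples for Y in samples)
has_unity = any(all(mul(E, X) == X and mul(X, E) == X for X in samples)
                for E in samples)
print(sums_have_form, products_vanish, has_unity)   # prints: True True False
```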

A ring R is said to be commutative if ab = ba for every a and b of R. The
ring of all integers is a commutative ring, the ring of all n-rowed square
matrices with elements in a field F is a noncommutative ring.

4. Abstract fields. If R is any ring, we shall designate by

R*

the set of all nonzero elements of R. Then we shall call R a division ring if
R* is a multiplicative group. This occurs clearly if and only if the equations
ax = b, ya = b have solutions in R* for every a and b of R*. The set of all
n-rowed square matrices is not a division ring. However, let c and d range
over all complex numbers, \bar{c} and \bar{d} be the complex conjugates of c and d.
Then the set Q of all two-rowed matrices

(13)    A = \begin{pmatrix} c & d \\ -\bar{d} & \bar{c} \end{pmatrix}

is a noncommutative division ring. The reader should verify this, noting
in particular that every A \neq 0 is nonsingular since A \neq 0 implies that
c \neq 0 or d \neq 0 and |A| = c\bar{c} + d\bar{d} > 0. The ring Q is a linear space of
order 4 over the field of all real numbers and it has the matrices I, A, B, AB
of (4) as a basis over that field. It is usually called the ring of real
quaternions.
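
Part of the verification asked of the reader can be carried out numerically. The sketch below assumes the form (13) with second row -\bar{d}, \bar{c} (which agrees with the matrices A and B of (4)); the function names and the sample values of c and d are chosen for illustration. It checks that |A| = c\bar{c} + d\bar{d}, that a product of two matrices of the form (13) again has that form, and that multiplication is not commutative.

```python
def quat(c, d):
    """The two-rowed matrix (13) determined by the complex numbers c and d."""
    return [[c, d], [-d.conjugate(), c.conjugate()]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

X = quat(1 + 2j, 3 - 1j)
Y = quat(2 - 1j, 1 + 1j)

print(det(X))                            # c*conj(c) + d*conj(d) = 5 + 10
P = mat_mul(X, Y)
print(P == quat(P[0][0], P[0][1]))       # the product again has the form (13)

A = quat(1j, 0)                          # the matrix A of (4)
B = quat(0, -1)                          # the matrix B of (4)
print(mat_mul(A, B) != mat_mul(B, A))    # multiplication is noncommutative
```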

Until the present we have restricted the term "field" to mean a field con- 
taining the field of all rational numbers. We now define fields in general. 

DEFINITION. A field is a ring F such that F* is a multiplicative abelian 
group. 

The identity element of the multiplicative group F* is then the unity 
element 1 of F. The whole set F is an additive group with identity element 
0, and 1 generates an additive cyclic subgroup [1]. If this cyclic group has 
infinite order, it may be shown to be equivalent to the set of all ordinary 
integers. But F is closed with respect to rational operations, and [1] then 
generates a subfield of F equivalent to the field of all rational numbers. We 
call all such fields nonmodular fields. 

The group [1] might, however, be a finite group. Its elements are then 
the sums 

(14) 1 = 0, 1, 2, . . . , p - 1 , 

where p is the order of this group, and we have the property that the sum 
1 + 1 + . . . + 1 with p summands is zero. It is easy to show that p is a 
prime, and it follows that if a is in F, then the sum a + a + . . . + a with 
p summands is equal to the product (1 + 1 + . . . + 1)a = 0. We call
such fields F modular fields of characteristic p. It may easily be shown that
the characteristic of all subfields of a field F is the same as that of F, and,
in fact, every subfield of F contains the subfield generated by the unity ele- 
ment of F under rational operations. 

5. Integral domains. A commutative ring with a unity element and with- 
out divisors of zero is called an integral domain. Any field forms a somewhat 
trivial example of an integral domain. Less trivial examples are the set 
F[x] of all polynomials in x, the set F[x_1, . . . , x_q] of all polynomials in
x_1, . . . , x_q and coefficients (in both cases) in a field F, the set of all ordinary
integers. The set of all integers may be extended to the field of all rational
numbers by adjoining quotients a/b, b ≠ 0. In a similar fashion we adjoin
quotients a/b for a and b ≠ 0 in J to imbed any integral domain J in a field.

When we studied polynomials in Chapter I we studied questions about 
divisibility and greatest common divisor. We may also study such ques- 
tions about arbitrary integral domains. Let us then formulate such a study. 

Let J be an integral domain and a and b be in J. Then we say that a is
divisible by b (or that b divides a) if there exists an element c in J such that
a = bc. If u in J divides its unity element 1, then u is called a unit of J.
Thus u is a unit of J if it has an inverse in J. The inverse of a unit is clearly 
also a unit. 

Two quantities b and b_0 are said to be associated if each divides the other.
Then b and b_0 are associated if and only if b_0 = bu for a unit u. Moreover,
if b divides a so does every associate of b. Every unit of J divides every a 
of J. Thus we are led to one of the most important problems about an inte- 
gral domain, that of determining its units. 

A quantity p of an integral domain J is called a prime or irreducible quan-
tity of J if p ≠ 0 is not a unit of J and the only divisors of p in J are units and
associates of p. Every associate of a prime is a prime. A composite quantity
of J is an a ≠ 0 which is neither a prime nor a unit of J. It is natural then
to ask whether or not every composite a of J may be written in the form

(15)        a = p_1 . . . p_r

for a finite number of primes p_i of J. We may also ask if it is true that
whenever also a = q_1 . . . q_s for primes q_j, then necessarily s = r and the q_j
are associates of the p_i in some order. When these properties hold we may
call J a unique factorization integral domain. The reader is familiar with the
fact that the set of all ordinary integers is such an integral domain. This
fact, as well as the corresponding property for the set of all polynomials in
x_1, . . . , x_n with coefficients in a field F, is derived in Chapter II of the
author's Modern Higher Algebra. 

The problem of determining a g.c.d. (greatest common divisor) of two 
elements a and b of a unique factorization domain is solvable in terms of the 
factorization of a and b. However, we saw that in the case of the set F[x]
the g.c.d. may be found by a Euclidean process. Let us then formulate the
problem regarding g.c.d.'s. We define a g.c.d. of two elements a and b not
both zero of J to be a common divisor d of a and b such that every common
divisor of a and b divides d. Then all g.c.d.'s of a and b are associates. We
call a and b relatively prime if their only common divisors are units of J,
that is, they have the unity element of J as a g.c.d. We now see that one of
the questions regarding an integral domain J is the question as to whether 
every a and b of J have a g.c.d. Moreover, we must ask whether there exist 
x and y in J such that 

d = ax + by 

is a common divisor and hence a g.c.d. of a and b; and finally whether or not 
d may be found by the use of a Euclidean Division Algorithm. 

6. Ideals and residue class rings. A subset M of a ring R is called an
ideal of R if M contains g - h, ag, ga, for every g and h of M and a of R.
Then M either consists of zero alone and will be called the zero ideal of R,
or M may be seen to be a subring of R with the property that the products
am, ma of any element m of M and any element a of R are in M.

If H is any set of elements of a ring R, we may designate by {H} the set
consisting of all finite sums of elements of the form xmy for x and y in R,
m in H. It is easy to show that {H} is an ideal. If H consists of finitely
many elements m_1, . . . , m_t of R, we write {H} = {m_1, . . . , m_t}, and if H
consists of only one element m of R, we write

(16)        M = {m}

for the corresponding ideal. This most important type of an ideal is called
a principal ideal. It consists of all finite sums of elements of the form amb
for a and b in R. When R is a commutative ring, M = {m} consists of all
products am for a in R.

The ring R itself is an ideal of R called the unit ideal. This term is de- 
rived from the fact that in the case where R has a unity quantity R = { 1 } . 
Evidently {0} is the zero ideal. 

Let M be an ideal of R and define

(17)        a ≡ b (M)

(read: a congruent b modulo M) if a - b is in M. We may then define
what we shall call a residue class ā of M for every a of R. We put into the
class every b in R such that a ≡ b (M). Clearly a ≡ b (M) if and only if
b ≡ a (M). Moreover, if a - b is in M and b - c in M, then (a - b) +
(b - c) = a - c is in M. It follows that ā = b̄ (ā and b̄ are the same resi-
due class) if and only if b is in ā.

Let us now define the sum and product of residue classes by

(18)        ā + b̄ = \overline{a + b} ,   ā · b̄ = \overline{ab} .

It may be verified readily that if ā_1 = ā, b̄_1 = b̄, then \overline{a_1 + b_1} = \overline{a + b},
\overline{a_1 b_1} = \overline{ab}. It follows that our definitions of sum and product of residue
classes are unique. Then it is easy to show that if M is not R the set of all
the residue classes forms a ring with respect to the operations just defined.
We call this ring the residue class or difference ring R - M (read: R minus
M). When M = R the residue classes are all the zero class, and we have
not called this set a ring.

When the residue class ring R - M is an integral domain, we call the
ideal M a prime ideal of R. We call M a divisorless ideal of R if R - M is
a field. These concepts coincide in the case where R - M has only a finite
number of elements since it may be shown that any integral domain with a 
finite number of elements is a field. This coincidence occurs in most of the 
topics of mathematics (in particular the Theory of Algebraic Numbers) where 
ideals are studied. 

7. The ring of ordinary integers. The set of all ordinary integers is a ring 
which we shall designate henceforth by E. It is easily seen to be an inte- 
gral domain, and we shall prove that it has the property of unique factori- 
zation. 

We observe first that the units of E are those ordinary integers u such
that uv = 1 for an integer v. But then 1 and -1 are the only units of E.
Thus the primes of E are the ordinary positive prime integers 2, 3, 5, etc.,
and their associates -2, -3, -5, etc. Every integer a is associated with
its absolute value |a| ≥ 0. We note now that if b is any integer not zero,
the multiples

(19)        0, |b|, -|b|, 2|b|, -2|b|, . . .

are clearly a set of integers one of which exceeds* any given integer a.
Then let (g + 1)|b| be the least multiple of |b| which exceeds a so that
g|b| ≤ a, (g + 1)|b| > a, a - g|b| = r such that 0 ≤ r < |b|. We put
q = g if b > 0, q = -g otherwise, and have g|b| = qb, a = bq + r. If also
a = bq_1 + r_1 with 0 ≤ r_1 < |b|, then b(q - q_1) = r_1 - r is divisible by b,
whereas |r - r_1| < |b|. This is possible only if r_1 = r and q_1 = q. We
have thus proved the Division Algorithm for E, a result we state as

Theorem 1. Let a and b ≠ 0 be integers. Then there exist unique integers
q and r such that 0 ≤ r < |b|, a = bq + r.
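
The construction in the proof translates directly into a short computation. The sketch below is in Python and is not part of the original text; the function name divide is a hypothetical choice. It follows the proof step by step: g is chosen so that g|b| ≤ a < (g + 1)|b|, and the sign of q is then adjusted according to the sign of b.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < |b|, for an integer b != 0."""
    if b == 0:
        raise ValueError("b must be nonzero")
    g = a // abs(b)          # floor quotient, so g|b| <= a < (g + 1)|b|
    r = a - g * abs(b)       # hence 0 <= r < |b|
    q = g if b > 0 else -g   # then g|b| == q*b
    return q, r

print(divide(-7, 3))   # (-3, 2) since -7 = 3*(-3) + 2
print(divide(7, -3))   # (-2, 1) since  7 = (-3)*(-2) + 1
```

Note that the remainder is never negative here, which differs from the built-in divmod of Python when b is negative.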

We now leave for the reader the application of the Euclidean process, 
which we used to prove Theorem 1.5, to our present case. The process yields 

* We use the concept of magnitude of integers throughout our study of the ring E.

Theorem 2. Let f and g be nonzero integers. Then there exist integers a
and b such that 

(20) d = af + bg 

is a positive divisor of both f and g. Then d is the unique positive g.c.d. of f 
and g. 

The result above implies 

Theorem 3. Let f, g, h be integers such that f divides gh and is prime to g. 
Then f divides h. 

For by Theorem 2 we have af + bg = 1, afh + bgh = h. But by hypothesis
gh = fq, h = (ah + bq)f is divisible by f.
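
The integers a and b of Theorem 2, which the proof of Theorem 3 uses in the form af + bg = 1, can be computed by carrying the coefficients along through the Euclidean process. The following Python sketch is not part of the original text; the name bezout is hypothetical, and it uses Python's floor division for the successive quotients rather than the nonnegative remainder of Theorem 1, which changes the intermediate values but not the final d.

```python
def bezout(f, g):
    """Return (d, a, b) with d == a*f + b*g the positive g.c.d. of nonzero f and g."""
    # Invariant: r0 == a0*f + b0*g and r1 == a1*f + b1*g at every step.
    r0, a0, b0 = f, 1, 0
    r1, a1, b1 = g, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, a0, b0, r1, a1, b1 = r1, a1, b1, r0 - q * r1, a0 - q * a1, b0 - q * b1
    if r0 < 0:   # pass to the positive associate
        r0, a0, b0 = -r0, -a0, -b0
    return r0, a0, b0

d, a, b = bezout(12, 18)
print(d, a * 12 + b * 18)   # both values equal the g.c.d. 6
```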

We then have 

Theorem 4. Let p be a prime divisor of gh. Then p divides g or h. 

For if p does not divide g, the g.c.d. of p and g is either 1 or an associate
of p. The latter is impossible, and therefore p is prime to g. By Theorem 3,
p then divides h.

We also have 

Theorem 5. Let m be an integer. Then the set of integers prime to m is 
closed with respect to multiplication. 

For let a and b be prime to m and d be the g.c.d. of ab and m. If ab is not
prime to m we have d > 1, and by Theorem 3 if d is prime to a it divides b.
But then a divisor c > 1 of d divides a or b as well as m contrary to hy- 
pothesis. 

We may now conclude our proof of what is sometimes called the Funda- 
mental Theorem of Arithmetic. 

Theorem 6. Every composite integer a is expressible in the form

(21)        a = ± p_1 . . . p_r

uniquely apart from the order of the positive prime factors p_1, . . . , p_r.

For if a = bc, every divisor of b or of c is a divisor of a. If a is composite,
it has divisors b such that 1 < b < |a|, and there exists a least divisor
p_1 > 1 of a. But then p_1 is a positive prime, a = p_1 a_2 for |a_2| < |a|. If
a_2 is a prime, we write a_2 = ± p_2 with p_2 a positive prime and have (21) for
r = 2. Otherwise a_2 is composite and has a prime divisor p_2 by the proof
above, a_2 = p_2 a_3 and a = p_1 p_2 a_3 for |a_3| < |a_2|. After a finite number of
stages the sequence of decreasing positive integers |a| > |a_2| > |a_3| >
. . . must terminate, and we have (21). If also a = ± q_1 . . . q_s for positive
primes q_1, . . . , q_s, the sign is uniquely determined by a, and p_1 . . . p_r =
q_1 . . . q_s. Then either we may arrange the q_j so that q_1 = p_1, or p_1 ≠ q_j
for j = 1, . . . , s. But if the divisor p_1 of q_1 . . . q_s is not equal to q_1, it does
not divide q_1 and by Theorem 4 must divide q_2 . . . q_s. By our hypothesis
it does not divide q_2 and must divide q_3 . . . q_s. This finite process leads to
a contradiction. Thus p_1 = q_1, p_2 . . . p_r = q_2 . . . q_s, and the proof just
completed may be repeated, and we may take p_2 = q_2. Proceeding simi-
larly, we ultimately obtain r = s, the p_i = q_i for an appropriate ordering*
of the q_j.
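
The existence half of the proof is an algorithm: split off the least divisor greater than 1, which is necessarily prime, and repeat on the quotient. A minimal Python sketch follows; it is not part of the original text, and the names least_divisor and factor are hypothetical. It searches for the least divisor only up to the square root of n, a standard shortcut that is not mentioned in the text.

```python
def least_divisor(n):
    """Least divisor p > 1 of an integer n > 1; such a p is necessarily prime."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n   # no divisor up to sqrt(n), so n itself is prime

def factor(a):
    """Return (sign, [p_1, ..., p_r]) with a == sign * p_1 * ... * p_r as in (21)."""
    if a in (0, 1, -1):
        raise ValueError("a must be neither zero nor a unit")
    sign, n, primes = (1 if a > 0 else -1), abs(a), []
    while n > 1:
        p = least_divisor(n)
        primes.append(p)
        n //= p   # the quotients decrease in absolute value, so the loop ends
    return sign, primes

print(factor(-360))   # (-1, [2, 2, 2, 3, 3, 5])
```

Because the least divisor is taken at each stage, the primes come out in nondecreasing order, which is the ordering used in the footnote to the proof.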

8. The ideals of the ring of integers. Let M be a nonzero ideal of the
set E of all integers and m be the least positive integer in M. Then M con-
tains every element qm of the principal ideal {m}. If h is in M, we may
use Theorem 1 to write h = mq + r where 0 ≤ r < m. But mq and h are
in M, h - mq = r is in M. Our definition of m implies that r = 0 and thus
that every element of M is in {m}. We have proved

Theorem 7. The ideals of E are principal ideals {m}, m a positive integer.

The residue classes of E modulo {m} are now the classes

(22)        0̄, 1̄, . . . , \overline{m - 1} .

For if a is any integer, we have a = mq + r for r = 0, 1, . . . , m - 1.
Then a - r is in {m}; ā = r̄. Thus the elements of the residue class ring

E - {m}

defined for m > 1 are given by (22). Then E - {m} is a ring whose zero
element is the class of all integers divisible by m and whose unity element
is the class 1̄ of all integers whose remainder in Theorem 1 on division by
m is 1.

If a is an integer prime to m, the elements of the residue class ā are all
prime to m. For by Theorem 2 there exist integers c and d such that

(23)        ac + md = 1 .

If b is in ā, then b = a + mq, bc + m(d - qc) = ac + m(qc + d - qc) =
ac + md = 1, and therefore b and m are relatively prime. But ā · c̄ = 1̄,
and Theorem 7 implies

Theorem 8. The residue classes ā in E - {m} defined for a prime to m
form a multiplicative abelian group.
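
For a particular m the group of Theorem 8 can be listed and its closure and inverses checked directly. The Python sketch below is not part of the original text; unit_classes and inverse_class are hypothetical names, each class is represented by its least positive element, and the three-argument pow with exponent -1 assumes Python 3.8 or later.

```python
from math import gcd

def unit_classes(m):
    """Representatives 0 < a < m of the residue classes prime to m (for m > 1)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def inverse_class(a, m):
    """Representative c with a*c ≡ 1 (mod m); it exists by (23) when a is prime to m."""
    return pow(a, -1, m)   # modular inverse, available in Python 3.8+

m = 12
U = unit_classes(m)
print(U)                                       # [1, 5, 7, 11]
print(all((a * b) % m in U for a in U for b in U))
print([(a, inverse_class(a, m)) for a in U])
```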

If m is a composite integer cd where c > 1, d > 1, then m > c, m > d,
and c̄ and d̄ are both not the zero class. But c̄ · d̄ = \overline{cd} = m̄ = 0̄, E - {m}
has divisors of zero and is not an integral domain. If m is a prime, then

* We may order the p_i so that p_1 ≤ p_2 ≤ . . . ≤ p_r and similarly assume that q_1 ≤
q_2 ≤ . . . ≤ q_s. Then we obtain r = s, p_i = q_i for i = 1, . . . , r.

every a not in {m} is prime to m, and Theorem 8 states that E - {m} is a
field. We have proved

Theorem 9. An ideal M of E is a prime ideal if and only if M = {p} for
a positive prime p, and M is then a divisorless ideal.

We observe now that a ≡ b (M) for M = {m} means that a - b = mq,
that is, a - b is divisible by m. Thus it is customary in the Theory of Num-
bers to write

(24)        a ≡ b (mod m)

(read: a congruent b modulo m) if a - b is divisible by m. But then ā = b̄,
and if we also have c̄ = d̄, we will have ā + c̄ = b̄ + d̄ as well as ā · c̄ =
b̄ · d̄. Hence if (24) and

(25)        c ≡ d (mod m)

hold, we have

(26)        a + c ≡ b + d (mod m) ,   ac ≡ bd (mod m) .

Thus the rules (26) for combining congruences are equivalent to the defini-
tions of addition and multiplication in E - {m}.

We next state the number-theoretic consequence of Theorem 8 and Lem-
ma 2 which is called Euler's Theorem and which we state as

Theorem 10. Let f(m) be the number of positive integers not greater than
m > 0 and prime to m. Then if a is prime to m we have

(27)        a^{f(m)} ≡ 1 (mod m) .

For f(m) is clearly the order of the multiplicative group defined in Theo-
rem 8. Our result then follows from Lemma 2.

We next have the Fermat Theorem.

Theorem 11. Let p be a prime. Then

(28)        a^p ≡ a (mod p)

for every integer a.

For (28) holds if a is divisible by p. Otherwise a is prime to p, and ā is
one of the residue classes 1̄, 2̄, . . . , \overline{p - 1}. Thus f(p) = p - 1 and by
Theorem 10 a^{p-1} - 1 is divisible by p, a(a^{p-1} - 1) = a^p - a is also di-
visible by p.
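
Both congruences are easy to confirm numerically for small moduli. The following Python sketch is not part of the original text; it computes f(m) by direct count, exactly as in the statement of Theorem 10, and uses the built-in three-argument pow for modular powers.

```python
from math import gcd

def f(m):
    """Number of positive integers not greater than m and prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

m = 20
print(f(m))                                                  # 8
print(all(pow(a, f(m), m) == 1 for a in range(1, m) if gcd(a, m) == 1))   # (27)
p = 13
print(all(pow(a, p, p) == a % p for a in range(-30, 30)))    # (28)
```

The direct count is adequate for illustration only; it makes no use of the factorization of m.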

The ring E - {p} defined by a positive prime p is a field* P whose

* This field is equivalent to the subfield generated by its unity quantity of any field
of characteristic p.

nonzero elements form an abelian group P* of order p - 1. The elements
of P* are the distinct roots of the equation x^{p-1} = 1 and are not all roots
of any equation of lower degree. Then it may be shown that P* is a cyclic
multiplicative group [r] where r is an integer such that 1, r, r^2, . . . , r^{p-2} are
a complete set of nonzero residue classes modulo {p}. Such an integer r is
called a primitive root modulo p. We then have

Theorem 12. Let p be a prime of the form 4n + 1. Then there exists an
integer t such that

(29)        t^2 + 1 ≡ 0 (mod p) .

For p - 1 = 4n, and we let r be a primitive root modulo p, t = r^n. Then
t^4 = r^{p-1} = 1, t^4 - 1 = (t^2 + 1)(t^2 - 1), (t^2 + 1)(t^2 - 1) = 0 in the field E -
{p}. However, t^2 = r^{2n} ≠ 1 since r is primitive, t^2 + 1 = 0 as desired.
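
The proof is constructive once a primitive root is known. The Python sketch below is not part of the original text; primitive_root and sqrt_minus_one are hypothetical names, and the primitive root is found by brute-force search, which is practical only for small p.

```python
def primitive_root(p):
    """Least r in 1..p-1 whose powers 1, r, ..., r^(p-2) give every nonzero class mod p."""
    for r in range(1, p):
        powers = {pow(r, e, p) for e in range(p - 1)}
        if len(powers) == p - 1:
            return r
    raise ValueError("no primitive root found; p must be a prime")

def sqrt_minus_one(p):
    """For a prime p = 4n + 1, return t = r^n reduced mod p, so that t^2 + 1 ≡ 0 (mod p)."""
    if p % 4 != 1:
        raise ValueError("p must have the form 4n + 1")
    n = (p - 1) // 4
    return pow(primitive_root(p), n, p)

t = sqrt_minus_one(13)
print(t, (t * t + 1) % 13)   # 8 0, since 8^2 + 1 = 65 = 5 * 13
```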

There are a number of other results on congruences which are corollaries
of theorems on rings and fields. However, we shall not mention them here. 

9. Quadratic extensions of a field. If F is a subfield of a field K which
is a linear space u_1F + . . . + u_nF over F, we say that K is a field of degree
n over F. The theory of linear spaces of Chapter IV implies that u_1 may be
taken to be any nonzero element of K. Hence we may take u_1 to be the
unity element 1 of F. Then if n = 1, the field K is F. We call K a quad-
ratic, cubic, quartic, or quintic field over F according as n = 2, 3, 4, or 5.

Let n = 2 so that K has a basis u_1 = 1, u_2 over F. The quantities of K
are then uniquely expressible in the form k = c_1 + c_2 u_2 for c_1 and c_2 in F,
and k is in F, k = k·1 + 0·u_2, if and only if c_2 = 0. Clearly, if k is in F,
then 1, k do not form a basis of K over F. We now say that a quantity u
in K generates K over F if 1, u are a basis of K over F. Then u generates K
over F if and only if u is not in F. For if 1, u are linearly dependent in F we
have a_1 + a_2 u = 0 for a_2 ≠ 0, u = -a_2^{-1} a_1 is in F.

The elements k of a quadratic field have the property that 1, k, k^2 are
linearly dependent in F, c_0 k^2 + c_1 k + c_2 = 0 for c_0, c_1, c_2 not all zero and
in F. If c_0 = 0, then c_1 cannot be zero, and k = -c_1^{-1} c_2 is in F, k is a root
of the monic polynomial (x - k)^2 with coefficients in F. If k is not in F,
then c_0 ≠ 0, k is a root of a monic polynomial of degree two. Thus every
element k of a quadratic field is a root of an equation

(30)        f(x, k) = x^2 - T(k)x + N(k) = 0

with T(k) and N(k) in F. In particular, if u generates K over F we have

(31)        u^2 - bu + c = 0

for b and c in F, and we propose to find the value of T(k) and N(k) as poly-
nomials in b, c and the coordinates k_1 and k_2 in F of k = k_1 + k_2 u.
Define a correspondence S on K to K by

(32)        k = k_1 + k_2 u → k^S = k_1 + k_2 u^S ,

for all k_1 and k_2 in F where u^S is in K. Then (32) is a one-to-one corre-
spondence of K and itself if and only if u^S generates K over F. If k_0 =
k_3 + k_4 u for k_3 and k_4 in F we have k_0^S = k_3 + k_4 u^S, k + k_0 = (k_1 + k_3) +
(k_2 + k_4)u, so that (k + k_0)^S = k_1 + k_3 + (k_2 + k_4)u^S, and we have

(33)        (k + k_0)^S = k^S + k_0^S .



Also k k_0 = k_1 k_3 + (k_1 k_4 + k_2 k_3)u + k_2 k_4 u^2 = (k_1 k_3 - k_2 k_4 c) + (k_1 k_4 +
k_2 k_3 + k_2 k_4 b)u, while k^S k_0^S = k_1 k_3 + (k_1 k_4 + k_2 k_3)u^S + k_2 k_4 (u^S)^2. But then

(34)        (k k_0)^S = k^S k_0^S

if and only if (u^S)^2 = b u^S - c, that is, u^S is a root in K of the quadratic
equation x^2 - bx + c = 0. But the quadratic equation can have only two
distinct roots in a field K, the sum of the roots is b,

(35)        u^S = u or b - u .

In the former case S is the identity correspondence k ↔ k. In either case
S defines a self-equivalence of K leaving the elements of F unaltered and
is called an automorphism over F of K.

If K were any field of degree n over F and if S and T were automorphisms 
over F of K, we would define ST as the result k → k^{ST} = (k^S)^T of apply-
ing first k → k^S and then k^S → (k^S)^T. It is easy to show that the set of all
automorphisms over F of K is a group G with respect to the operation just
defined. In case G has order equal to the degree n of K over F it is called
the Galois group of K over F, and that branch of algebra called the Galois
Theory is concerned with relative properties of K and G. In our present
case u^{S^2} = (u^S)^S = (b - u)^S = b - (b - u) = u so that S^2 is the identity
automorphism. But b - u = u if and only if 2u = b, which is not possible
since u is not in F unless K is a modular field in which u + u = 0. Hence
if K is a nonmodular quadratic field over F, the automorphism group of K 
over F is the cyclic group [S] of order 2 and is the Galois group of K. 

We see now that if k = k_1 + k_2 u, then k k^S = (k_1 + k_2 u)[k_1 + k_2(b -
u)] = k_1^2 + k_1 k_2 b + k_2^2 u(b - u). But bu - u^2 = c. Hence if

(36)        T(k) = k + k^S ,   N(k) = k k^S ,

we have

(37)        T(k) = 2k_1 + k_2 b ,   N(k) = k_1^2 + k_2^2 c + k_1 k_2 b ,

and the polynomial of (30) is

(38)        f(x, k) = (x - k)(x - k^S) .

The function T(k) is called the trace of k, and N(k) is called the norm of k.
Moreover, we may show that

(39)        T(a_1 k + a_2 k_0) = a_1 T(k) + a_2 T(k_0) ,   N(k k_0) = N(k) N(k_0)

for every a_1 and a_2 of F and k and k_0 of K. For T(a_1 k + a_2 k_0) = (a_1 k + a_2 k_0) +
(a_1 k + a_2 k_0)^S = a_1 k + a_2 k_0 + a_1 k^S + a_2 k_0^S = a_1(k + k^S) + a_2(k_0 + k_0^S) as
desired. Similarly,* N(k k_0) = k k_0 (k k_0)^S = k k_0 k^S k_0^S = (k k^S)(k_0 k_0^S). Note that
if k is in F, then N(k) = k^2.
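
Formulas (36), (37), and (39) can be checked on examples with exact rational arithmetic. In the Python sketch below, which is not part of the original text, an element k = k_1 + k_2 u is stored as the pair (k1, k2), and F is taken to be the rational field; the names mul, conj, trace, and norm are hypothetical.

```python
from fractions import Fraction

def mul(k, k0, b, c):
    """Product of k = (k1, k2) and k0 = (k3, k4), reduced by u^2 = b*u - c from (31)."""
    k1, k2 = k
    k3, k4 = k0
    return (k1 * k3 - k2 * k4 * c, k1 * k4 + k2 * k3 + k2 * k4 * b)

def conj(k, b):
    """k^S for the automorphism u -> b - u of (35)."""
    k1, k2 = k
    return (k1 + k2 * b, -k2)

def trace(k, b):
    k1, k2 = k
    return 2 * k1 + k2 * b                       # first formula of (37)

def norm(k, b, c):
    k1, k2 = k
    return k1 * k1 + k2 * k2 * c + k1 * k2 * b   # second formula of (37)

b, c = Fraction(1), Fraction(-1)                 # u^2 - u - 1 = 0, chosen for illustration
k = (Fraction(1, 2), Fraction(3))
k0 = (Fraction(-2), Fraction(5, 7))
print(mul(k, conj(k, b), b, c) == (norm(k, b, c), 0))                 # N(k) = k k^S
print(norm(mul(k, k0, b, c), b, c) == norm(k, b, c) * norm(k0, b, c))  # (39)
```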

Let us now assume that the field K is nonmodular so that K = F + uF
where u satisfies (31). Then K is also generated by the root u - b/2 of the
equation

(40)        x^2 = a    (a in F) ,

if a = (u - b/2)^2 = u^2 - bu + b^2/4 = -c + b^2/4. Let us then assume with-
out loss of generality that u is a root of (40) so that in (31) we have b = 0,
c = -a. Then (37) has the simplified form given by

(41)        T(k) = 2k_1 ,   N(k) = k_1^2 - k_2^2 a .

Now if a = d^2 for d in F, we have u^2 = d^2, (u + d)(u - d) = 0, whereas u is
not in F and u + d ≠ 0, u - d ≠ 0. This is impossible in a field.

The quantities of K now consist of all polynomials in u with coefficients
in F. For if k(x) is any polynomial in F[x], we may write k(x) = k_1(x^2) +
k_2(x^2)x, k(u) = k_1(a) + k_2(a)u. Thus every nonmodular quadratic field is the
ring F[u] of all polynomials with coefficients in F in an algebraic root u of an
equation x^2 = a where a is in F, a is not the square of any quantity of F,
and F is nonmodular. Since K is a field, it is actually the field F(u) of all
rational functions in u with coefficients in F.

Conversely, if the nonmodular field F and the equation x 2 = a in F are 
given, then K is defined. For we may take

(42)        u = \begin{pmatrix} 0 & 1 \\ a & 0 \end{pmatrix} ,   u^2 = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}



* The trace function is thus called a linear function and the norm function a multipli-
cative function.

and identify F with the set of all two-rowed scalar matrices with elements
in F (so as to make K contain F). Then every polynomial in u has the form
k = k_1 + k_2 u and N(k) = k_1^2 - k_2^2 a = 0 if and only if k_1 = k_2 = k = 0.
For otherwise k_2 ≠ 0, a = (k_1 k_2^{-1})^2 contrary to hypothesis. But then we
take K = F + uF and have k^{-1} = (k_1^2 - k_2^2 a)^{-1}(k_1 - k_2 u) for every k ≠ 0 of
K. Hence K is the field F(u). That K is nonmodular is evident since the
unity element of K is that of F. We have thus constructed all quadratic
fields K over a nonmodular field F in terms of equations x^2 = a for a in
F, a ≠ d^2 for any d of F.

Observe in closing that if K = F(u), then K = F(v) for every v = bu
such that b ≠ 0 is in F. But v is a root of

(43)        x^2 = b^2 a .

Thus we may replace the defining quantity a in F by any multiple b^2 a for
b ≠ 0 in F. It is also shown easily that if K is defined by a and K_0 by a_0,
then K and K_0 are fields equivalent over F if and only if a_0 = b^2 a.

10. Integers of quadratic fields. The Theory of Algebraic Numbers is 
concerned principally with the integral domain consisting of the elements 
called algebraic integers in a field K of degree n over the field of all rational 
numbers. We shall discuss somewhat briefly the case n = 2. 

Let, then, K be a quadratic field over the rational field so that K is
generated by a root u of u^2 = a where a is rational and not a rational square.
By the use of (43) we may multiply a by an integer and hence take a inte-
gral. Write a = c^2 d where d has no square factor and c and d are ordinary
integers. If we take b = c^{-1} in (43) we replace a by d. Hence every quadrat-
ic field is generated by a root u of the quadratic equation

(44)        x^2 = a = ± p_1 . . . p_r ,

where the p_i are distinct positive primes and r ≥ 1 for a > 0, while if
a = -1 we interpret (44) as the case a negative and r = 0.

The quantities k of K have the form k = k_1 + k_2 u where k_1 and k_2 are
ordinary rational numbers. We call k an integer of K if the coefficients
T(k), N(k) of (30) are integers. Thus k is an integer of K if and only if
2k_1 and k_1^2 - k_2^2 a are both ordinary integers. We shall determine all integers
of K stating our final results as

Theorem 13. The integers of a quadratic field K form an integral domain J
containing the ring E of all ordinary integers. Then J consists of all linear
combinations

(45)        c_1 + c_2 w

for c_1 and c_2 in E, where w = u if a ≡ 2 (mod 4) or a ≡ 3 (mod 4) but

(46)        w = (1 + u)/2

if a ≡ 1 (mod 4).

Note that a ≢ 0 (mod 4) since a has no square factor. We write

        k_1 = b_1/b_0 ,   k_2 = b_2/b_0 ,   k = (b_1 + b_2 u)/b_0

for b_0, b_1, b_2 in E and b_0 the positive least common denominator of k_1 and k_2.
Then 1 is the g.c.d. of b_0, b_1, b_2. Now 2k_1 is an integer, b_0 divides 2b_1,
k_1^2 - k_2^2 a = b_0^{-2}(b_1^2 - b_2^2 a), so that b_0^2 divides b_1^2 - b_2^2 a. If p were an odd prime
factor of b_0, it would divide 2b_1 only if it divided b_1. But then p^2 would
divide b_1^2, and p^2 would divide b_1^2 - b_2^2 a as well as b_2^2 a. Since a has no
square factors this is possible only if p divides b_2, a contradiction. Hence
b_0 is a power of 2. If b_0 ≥ 4 divides 2b_1, then 2 divides b_1, 4 divides b_1^2 and
b_2^2 a, and thus 2 divides b_2. We have a contradiction and have proved that
b_0 = 1, 2. If b_0 = 2 and b_1 is even, then 4 divides b_2^2 a, and hence 2
divides b_2 contrary to the definition of b_0. Similarly, if b_2 were even, then
4 would divide b_1^2, a contradiction. Hence b_1 = 2m_1 + 1, b_2 = 2m_2 + 1 for
ordinary integers m_1 and m_2, and b_1^2 - b_2^2 a = 4[m_1^2 + m_1 - a(m_2^2 + m_2)] +
1 - a is divisible by 4 if and only if a ≡ 1 (mod 4). Thus we have proved
that J consists of the elements (45) with w = u if a ≡ 2, 3 (mod 4). But
if a ≡ 1 (mod 4), then we have shown that either k = b_1 + b_2 u with b_1
and b_2 in E, b_0 = 1, or k = m_1 + m_2 u + w for w in (46). However, u =
2w - 1, and in either case k has the form (45) with c_1 and c_2 in E.
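
The two descriptions of J, by integrality of T(k) and N(k) and by integral coordinates relative to 1 and w, can be compared on examples. The Python sketch below is not part of the original text; is_integer_of_K and coordinates are hypothetical names, and a is assumed to be an ordinary integer without square factors, as in (44).

```python
from fractions import Fraction

def is_integer_of_K(k1, k2, a):
    """k1 + k2*u with u^2 = a is an integer of K when T(k) and N(k) are ordinary integers."""
    k1, k2 = Fraction(k1), Fraction(k2)
    T, N = 2 * k1, k1 * k1 - k2 * k2 * a
    return T.denominator == 1 and N.denominator == 1

def coordinates(k1, k2, a):
    """Return (c1, c2) with k = c1 + c2*w as in (45), or None if c1, c2 are not both integral."""
    k1, k2 = Fraction(k1), Fraction(k2)
    if a % 4 == 1:        # w = (1 + u)/2, so k = (k1 - k2) + (2*k2)*w
        c1, c2 = k1 - k2, 2 * k2
    else:                 # w = u
        c1, c2 = k1, k2
    if c1.denominator == 1 and c2.denominator == 1:
        return int(c1), int(c2)
    return None

half = Fraction(1, 2)
print(is_integer_of_K(half, half, 5), coordinates(half, half, 5))   # True (0, 1)
print(is_integer_of_K(half, half, 2), coordinates(half, half, 2))   # False None
```

For a without square factors the two functions agree on every input tried, which is what Theorem 13 asserts.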

It remains to show that J is an integral domain. The elements of J are
in the field K, and thus it suffices to show that k + h, kh are in J for every
k and h of J. But the sum of k and h of the form (45) is clearly of that
form, the product

        (c_1 + c_2 w)(d_1 + d_2 w) = c_1 d_1 + (c_1 d_2 + c_2 d_1)w + c_2 d_2 w^2

is of the form (45) if w^2 is of that form. But this condition holds since if
w = u then w^2 = a + 0·w, while otherwise u^2 = a = 4m + 1 for an inte-
ger m, T(w) = 1/2 + 1/2 = 1, N(w) = (1 - a)/4 = -m, w^2 - w - m = 0,

(47)        w^2 = m + w .

This completes the proof.

The units of J are elements k such that kh = 1 for h also in J. Then
N(kh) = N(k)N(h) = 1, N(k) is an ordinary integer which divides unity,

(48)        N(k) = ±1 .

Conversely, if (48) holds, k has the property k k^S = 1 or -1. But k^S is in
J when k is in J since T(k^S) = T(k) and N(k^S) = N(k). Hence k^{-1} = k^S or
k^{-1} = -k^S. Thus (48) is a necessary and sufficient condition that k be a
unit of J.

If a ≡ 2, 3 (mod 4) so that k = c_1 + c_2 u, then (48) is equivalent to

(49)        c_1^2 - c_2^2 a = ±1 .

However, if a ≡ 1 (mod 4), then (48) becomes N(k) = c_1^2 + c_1 c_2 - c_2^2 m =
±1 so that 4N(k) = 4c_1^2 + 4c_1 c_2 + c_2^2 - (4m + 1)c_2^2 = ±4. But this is
equivalent to

(50)        (2c_1 + c_2)^2 - c_2^2 a = ±4 .

We may determine the units of J completely and simply in case a is
negative. For both (49) and (50) have the form x_1^2 + x_2^2 g = 1, 4 for
g = -a > 0, and this is possible for ordinary integers x_1 and x_2 and g > 4
only if x_2 = 0, c_1 = 1, -1. Now a has no square factors, and hence g ≠ 4,
the only possible remaining cases are g = 1, 2, 3. If g = 2 we have x_1^2 +
2x_2^2 = 1 only if x_2 = 0. Hence we have proved that the units of J are 1, -1
for every a < 0 save only a = -1, -3.

Now let a = -1 so that (49) becomes c_1^2 + c_2^2 = 1. Then one of c_1 and c_2
is zero, the other is 1 or -1, and the units of J are

(51)        1, u, -u, u^2 = -1 .

In the remaining case a = -3 we use (50) and have (2c_1 + c_2)^2 + 3c_2^2 =
4, and if c_2 ≠ 0 we must have c_2 = ±1, 2c_1 + c_2 = ±1 with any choice
of signs. Then c_2 = 1 gives c_1 = 0 or -1, while c_2 = -1 gives 0 or 1
for c_1. Clearly, the units of J are

(52)        1, -1, w, -w, w^S, -w^S .

The units of J form a multiplicative group, and we have shown that if a is 
negative this group is a finite group. 

If a = 2, then h = 3 + 2u has norm 9 - 4u^2 = 9 - 8 = 1. Hence h is
a unit of J, and so is h^t for every integer t. If h^t = h^s for s ≠ t, we may take
t > s and have h^{t-s} = 1. But we may regard u as the ordinary positive √2
and have h > 5, h^{t-s} > 5^{t-s} > 1. Hence the multiplicative group of the
units of J is an infinite group. It may similarly be shown to be infinite for
every positive a.
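
The first few powers of h = 3 + 2u illustrate the argument. In the Python sketch below, which is not part of the original text, an element x_1 + x_2 u with u^2 = a is stored as a pair of ordinary integers; the names mul, norm, and powers are hypothetical.

```python
def mul(x, y, a=2):
    """(x1 + x2*u)(y1 + y2*u) with u^2 = a, on pairs of ordinary integers."""
    return (x[0] * y[0] + a * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def norm(x, a=2):
    """N(x1 + x2*u) = x1^2 - x2^2 * a, as in (41)."""
    return x[0] * x[0] - a * x[1] * x[1]

def powers(h, count, a=2):
    """Return [h, h^2, ..., h^count]."""
    out, cur = [], (1, 0)
    for _ in range(count):
        cur = mul(cur, h, a)
        out.append(cur)
    return out

ps = powers((3, 2), 3)
print(ps)                      # [(3, 2), (17, 12), (99, 70)]
print([norm(x) for x in ps])   # [1, 1, 1]
```

Every power has norm 1 and the coordinates grow without bound, in agreement with the statement that the powers of h are distinct units.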

We shall not study the units of quadratic fields further but shall pass on 
to some results on primes and prime ideals in two special cases. 

11. Gauss numbers. The complex numbers of the form x + yi with ra- 
tional x and y are called the Gauss complex numbers, and those for which 
x and y are integers are called the Gauss integers. Then the Gauss complex 
numbers are the elements of a quadratic field K of our study with 

        u^2 = -1 ,

and the Gauss integers comprise its integral domain J. We have deter-
mined the units ±1, ±u of J and shall now study its divisibility properties.
Our first result is the Division Algorithm which we state as

Theorem 14. Let f and g ≠ 0 be in J. Then there exist elements h and r
in J such that

(53)        f = gh + r

and 0 ≤ N(r) < N(g).

For f g^{-1} is in K, f g^{-1} = k_1 + k_2 u with rational k_1 and k_2. Every rational
number t lies in an interval s ≤ t < s + 1 for an ordinary integer s. If
s ≤ t ≤ s + 1/2 then |t - s| ≤ 1/2 while if s + 1/2 < t < s + 1 then |t -
(s + 1)| < 1/2. Hence there exist ordinary integers h_1 and h_2 such that

(54)        s_1 = k_1 - h_1 ,   s_2 = k_2 - h_2 ,   |s_1| ≤ 1/2 ,   |s_2| ≤ 1/2 .

Put h = h_1 + h_2 u, s = s_1 + s_2 u, r = sg so that f g^{-1} = h + s, f = gh +
sg = gh + r. Then N(s) = s_1^2 + s_2^2 ≤ 1/2, N(r) = N(s)N(g) < N(g) as de-
sired.
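
The proof gives a procedure for computing h and r: form f g^{-1} exactly and round each coordinate to a nearest ordinary integer. The Python sketch below is not part of the original text; a Gauss integer x + yu is stored as the pair (x, y), the names nearest, norm, and divide are hypothetical, and ties at a half are rounded upward, which is one admissible choice among those the proof allows.

```python
from fractions import Fraction

def nearest(t):
    """An ordinary integer h with |t - h| <= 1/2, namely floor(t + 1/2)."""
    return (2 * t.numerator + t.denominator) // (2 * t.denominator)

def norm(z):
    """N(x + y*u) = x^2 + y^2 for u^2 = -1."""
    return z[0] * z[0] + z[1] * z[1]

def divide(f, g):
    """Return (h, r) with f = g*h + r and N(r) < N(g), for Gauss integers f and g != 0."""
    n = norm(g)
    if n == 0:
        raise ValueError("g must be nonzero")
    # f * g^{-1} = f * g^S / N(g) = k1 + k2*u
    k1 = Fraction(f[0] * g[0] + f[1] * g[1], n)
    k2 = Fraction(f[1] * g[0] - f[0] * g[1], n)
    h = (nearest(k1), nearest(k2))
    gh = (g[0] * h[0] - g[1] * h[1], g[0] * h[1] + g[1] * h[0])
    return h, (f[0] - gh[0], f[1] - gh[1])

print(divide((2, 1), (1, 1)))   # ((2, 0), (0, -1)), that is, 2 + u = 2(1 + u) - u
```

A different tie-breaking rule in nearest would return a different but equally valid pair h, r.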

We observe that the quotient h and the remainder r need not be unique in 
our present case. For example, if f = 2 + u and g = 1 + u, we have
N(g) = 2. Then f = 1·g + 1 = 2·g - u with N(1) = N(-u) = 1.

We shall use the Division Algorithm to prove the existence of a g.c.d.
Our proof will be different and simpler than that we gave in the case of 
polynomials and indicated in the case of integers but has the defect of being 
merely existential and not constructive. We first prove 

Theorem 15. Every ideal M of J is a principal ideal. 

For the norms of the nonzero elements of M are positive integers, and
there exists an m ≠ 0 in M such that N(m) ≤ N(f) for every f ≠ 0 of M. By
Theorem 14 we have f = mh + r for h and r in J and N(r) < N(m). But
f, m and mh are in M, so is r = f - mh, and it follows that r must be zero.
Thus f = mh, M = {m}.

The above result then implies 

Theorem 16. Let f and g be nonzero elements of J. Then there exist b and c
in J such that

(55)        d = bf + cg

is a common divisor of f and g. Thus d is a g.c.d. of f and g.

For the set of all elements of the form xf + yg with x and y in J is an
ideal M of J. By Theorem 15 we have M = {d}, d in M has the form (55)
and divides f = 1·f + 0·g, and g = 0·f + 1·g in M.

The above result may now be seen to imply that if f divides gh and is
prime to g, then f divides h, and also if p is a prime of J dividing gh, then p
divides g or h. Moreover, if f is a composite integer of J, then f = gh for
nonunit g and h, N(g) < N(f). Then if p is a nonunit divisor of f of least
norm it is a prime divisor of f, and we continue as in the proof of Theorem 6
to obtain

Theorem 17. Every composite Gauss integer is expressible as a product

(56)        f = p_1 . . . p_r

of primes p_i which are determined uniquely by f apart from their order and
unit multipliers.

We observe that Theorem 17 implies that an ideal M of J is a prime ideal
(and, in fact, a divisorless ideal) if and only if M = {d} for a prime d of J.
Let us then determine the prime quantities d = d_1 + d_2 u of J. We see first
that if d is a prime of J, so is d^S = d_1 - d_2 u. For if d^S = gh with N(g) ≠ 1,
N(h) ≠ 1, we have d = g^S h^S for N(g^S) = N(g), N(h^S) = N(h), and d is
composite. We now prove

Theorem 18. A positive prime p of E is either a prime of J or the product 
N(d) = dd^S, where d is a prime of J. Every prime d of J is either associated 
with a positive prime of E or arises from the factorization in J of a positive 
prime p = dd^S of E which is composite in J. 

For if p is a positive prime of E and is composite in J, then p = dk for 
N(d) > 1, N(k) > 1, N(p) = p^2 = N(d)N(k). But then p = N(d). If d = gh 
with N(g) > 1, then N(d) = p = N(g)N(h) so that N(h) = 1, h is a unit, 
d is a prime. Conversely, let d be a prime of J. Then c = N(d) is a 
positive integer of E and is either a prime or is composite. But c = dd^S 
can have at most two prime factors in E, and if c is composite, c = p p_0 
for positive primes p and p_0 of E. We may assume that p is associated 
with d and p_0 with d^S so that p_0^S = p_0 is associated with d and with p. 
Since p and p_0 are both in E, we have p_0 = ±p. But p and p_0 are positive 
and must be equal, N(d) = p^2. Therefore d is associated with the prime p 
of E which is prime in J. 

We now complete our study of the primes of J by proving 

Theorem 19. A positive prime p of E is a prime of J if and only if p has 
the form 4n + 3. 

To prove this result we observe first that if t is an odd integer of E we 
have t = 2s + 1, t^2 = 4s^2 + 4s + 1 = 4s(s + 1) + 1. One of s and s + 1 
is even, t^2 = 8r + 1 ≡ 1 (mod 8). If t is even, we have t^2 ≡ 0 (mod 4) 
and t^2 ≡ 0, 4 (mod 8). Thus a sum of two squares is congruent to 0, 1, 2, 4, 
or 5 modulo 8 while 4n + 3 is congruent to 3 or 7 modulo 8. It follows that 
p = 4n + 3 ≠ x^2 + y^2. We now assume that p = gh for g and h in J and 
have p^2 = N(p) = N(g)N(h). If neither g nor h were a unit, we would 
have N(g) > 1, N(h) > 1, and both of these integers would be divisors of 
p^2. But then N(g) = N(h) = p = x^2 + y^2, which is impossible. Hence, 
p = 4n + 3 is prime in J. We note that 2 = 1 + 1 = (1 + u)(1 - u) is 
composite in J and that it remains to show that p = 4n + 1 is composite 
in J. We know by Theorem 12 that there exists an integer b in E such that 
b^2 + 1 is divisible by p. If p divides b + u or b - u, then 
b ± u = p(k_1 + k_2 u), and ±1 = p k_2, which is impossible. But a prime p 
of J cannot divide the product (b + u)(b - u) = b^2 + 1 without dividing 
one of its factors b + u, b - u. Hence p is not a prime of J. This 
completes the proof. 
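
By Theorem 18 a positive prime p of E is composite in J exactly when p = N(d) = x^2 + y^2, so Theorem 19 may be tested numerically by searching for such representations. The following is a minimal sketch by exhaustive search; the function names and the bound 200 are assumptions made for illustration.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def two_square_rep(p: int):
    """Return (x, y) with x <= y and x^2 + y^2 = p, or None if none exists."""
    x = 0
    while 2 * x * x <= p:
        y2 = p - x * x
        y = isqrt(y2)
        if y * y == y2:
            return (x, y)
        x += 1
    return None

primes = [p for p in range(2, 200) if is_prime(p)]
for p in primes:
    # p is prime in J (no representation) exactly when p = 4n + 3.
    assert (two_square_rep(p) is None) == (p % 4 == 3)
```

Thus 13 = 2^2 + 3^2 = (2 + 3u)(2 - 3u) is composite in J, while 3, 7, 11 have no such representation and remain prime.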

We use the result above to derive an interesting theorem of the Theory of 
Numbers. We call a positive integer c a sum of two squares if c = x^2 + y^2 
for x and y in E. Then we have 

Theorem 20. Write c = f^2 g where f and g are positive integers and g has 
no square factors. Then c is a sum of two squares if and only if no prime 
factor of g has the form 4n + 3. 

For if c = x^2 + y^2 = (x + yu)(x - yu), we may write x + yu = d_1 ... d_r 
for primes d_i in J, c = N(d_1) ... N(d_r). Then N(d_i) = p is a prime 
of E if and only if p ≠ 4n + 3, and otherwise N(d_i) = p^2. Thus the prime 
factors of c of the form 4n + 3 occur to even powers and are not factors of g. 
Conversely, if g = p_1 ... p_r for positive primes p_i of E not of the form 
4n + 3, we have p_i = N(d_i) for d_i in J, g = N(d_1) ... N(d_r) = 
N(d_1 ... d_r), and c = N(f d_1 ... d_r) = N(k) = x^2 + y^2 for 
k = x + yu in J. 
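
The criterion of Theorem 20 may be compared with a direct search for x and y. The following is a minimal sketch, assuming trial division is adequate for the small range tested; the function names and the bound 500 are illustrative.

```python
from math import isqrt

def factor(c: int) -> dict:
    """Prime factorization of c >= 1 as {prime: exponent}."""
    out, p = {}, 2
    while p * p <= c:
        while c % p == 0:
            out[p] = out.get(p, 0) + 1
            c //= p
        p += 1
    if c > 1:
        out[c] = out.get(c, 0) + 1
    return out

def criterion(c: int) -> bool:
    """Theorem 20: no prime factor of g (where c = f^2 g) has the form 4n + 3."""
    # The primes of g are exactly those occurring in c to an odd power.
    g_primes = [p for p, e in factor(c).items() if e % 2 == 1]
    return all(p % 4 != 3 for p in g_primes)

def is_sum_of_two_squares(c: int) -> bool:
    """Direct search for x <= y in E with x^2 + y^2 = c (zero is allowed)."""
    x = 0
    while 2 * x * x <= c:
        y2 = c - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
        x += 1
    return False

for c in range(1, 501):
    assert criterion(c) == is_sum_of_two_squares(c)
```

For example, 45 = 3^2 · 5 has g = 5 and 45 = 3^2 + 6^2, while 21 = 3 · 7 has g = 21 and is not a sum of two squares.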

Note in closing that a positive prime p of the form 4n + 3 divides x^2 + y^2 
if and only if p divides both x and y. For p is a prime of J and divides 
(x + yu)(x - yu) if and only if p divides either x + yu or x - yu. Then 
x ± yu = p(k_1 ± k_2 u), x = p k_1, y = p k_2 for integers k_1 and k_2 of E. 
It follows, then, that if x, y, z are ordinary integers such that 

(57) x^2 + y^2 = z^2, 

every prime divisor p = 4n + 3 of z divides both x and y. Then let x, y, z 
have g.c.d. unity. It follows that the odd prime divisors of z have the form 
4n + 1. We may show readily also that x and y cannot both be even or be 
odd and that z must be odd.* 
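
The statements about the solutions of (57) with g.c.d. unity can be observed on small cases. The following is a minimal sketch by exhaustive search; the function names and the bound 60 on z are assumptions made for illustration.

```python
from math import gcd, isqrt

def prime_divisors(n: int) -> list:
    """Distinct prime divisors of n >= 1 by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def primitive_triples(limit: int) -> list:
    """All (x, y, z), x <= y, with x^2 + y^2 = z^2, z <= limit, g.c.d. unity."""
    found = []
    for x in range(1, limit):
        for y in range(x, limit):
            s = x * x + y * y
            z = isqrt(s)
            if z <= limit and z * z == s and gcd(gcd(x, y), z) == 1:
                found.append((x, y, z))
    return found

triples = primitive_triples(60)
for x, y, z in triples:
    assert z % 2 == 1                                   # z is odd
    assert (x + y) % 2 == 1                             # one of x, y is even
    assert all(p % 4 == 1 for p in prime_divisors(z))   # divisors are 4n + 1
```

The triples found include (3, 4, 5), (5, 12, 13), and (8, 15, 17), whose values of z are primes of the form 4n + 1.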

12. An integral domain with nonprincipal ideals. We shall close our 
exposition with a discussion of some properties of the ring J of integers of 
the field K defined by a = u^2 = -5. Since -5 ≡ 3 (mod 4), the elements 
of J have the form k_1 + k_2 u for k_1 and k_2 in the set E of all integers. 
Observe that if b is an integer of E which is a divisor in J of k_1 + k_2 u, 
then b must divide both k_1 and k_2. For k_1 + k_2 u = b(h_1 + h_2 u), 
k_1 = b h_1, k_2 = b h_2. We now prove 

Theorem 22. The elements 3, 7, 1 + 2u, 1 - 2u are primes of J no two 
of which are associated. 

For if k is a composite of J we have k = gh, N(k) = N(g)N(h) for ordinary 
integral proper divisors N(g) and N(h) of N(k). The norms of the 
integers of our theorem are 9, 49, 21, 21, respectively, and the only positive 
proper divisors of these norms are 3, 7. But if g = g_1 + g_2 u, we have 
N(g) = g_1^2 + 5g_2^2 > 0, g_1^2 + 5g_2^2 = 3, 7. Evidently g_2 ≠ 0, 
g_2^2 ≤ 1. But g_2^2 = 1 is impossible since g_1^2 ≠ -2, 2. Thus 3, 7, 
1 + 2u, 1 - 2u are primes of J. The units of J are 1, -1, and clearly no 
two of these four primes are associated. 

We see now that 21 = 3·7 = (1 + 2u)(1 - 2u), and we have factored 
21 into prime factors in J in two distinct ways. Moreover, the principal 
ideal {3} of J defined for the prime 3 is not a prime ideal. For 3 does not 
divide 1 + 2u and 1 - 2u in J yet does divide their product, and therefore 
the residue class ring J - {3} contains 1 + 2u and 1 - 2u as divisors of 
zero. 
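
The two factorizations of 21 and the norm argument of Theorem 22 can be verified by direct computation. The following is a minimal sketch, assuming the elements k_1 + k_2 u of J are stored as pairs (k1, k2); the function names are illustrative.

```python
# Elements k1 + k2*u of J, with u^2 = -5, stored as pairs (k1, k2).
# The helper names below are illustrative, not taken from the text.

def mul(x, y):
    return (x[0] * y[0] - 5 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def norm(x):
    """N(k1 + k2 u) = k1^2 + 5 k2^2."""
    return x[0] ** 2 + 5 * x[1] ** 2

# Two factorizations of 21.
assert mul((3, 0), (7, 0)) == (21, 0)
assert mul((1, 2), (1, -2)) == (21, 0)

# Norms of 3, 7, 1 + 2u, 1 - 2u.
assert [norm(k) for k in [(3, 0), (7, 0), (1, 2), (1, -2)]] == [9, 49, 21, 21]

# A proper divisor would have norm 3 or 7, which forces |k1| <= 2 and
# |k2| <= 1, so the search below is exhaustive.
small = [(a, b) for a in range(-3, 4) for b in range(-2, 3)
         if norm((a, b)) in (3, 7)]
assert small == []
```

Since no element of J has norm 3 or 7, none of the four factors has a proper divisor, in agreement with the proof of Theorem 22.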

The ring J contains nonprincipal ideals one of which we shall exhibit. 
We let M be the ideal of J consisting of all elements of J of the form 

(58) 3* + y(l + 2u) (x, y in J) . 

If this ideal were a principal ideal {d}, there would exist a common divisor 
d of 3 and 1 + 2u. Since these are nonassociated primes, d must be a unit 
1 or -1 of J, and {d} = {1}. Then 1 = 3b + (1 + 2u)c for b and c in J, 
7 = 21b + (1 + 2u)7c = (1 + 2u)[b(1 - 2u) + 7c]. But 1 + 2u does not 
divide 7, a contradiction. We now proceed to prove that M is a divisorless 
ideal of J and hence is a prime ideal of J. 

* For further results see L. E. Dickson's Introduction to the Theory of 
Numbers, pp. 40-42. 

We have shown that M is not a principal ideal and does not contain 1. 
Since M contains 3 but not 1 it cannot contain 2. For otherwise 3 - 2 = 1 
would be in M. Let c be an integer of E in M so that c = 3q + r where 
r = 0, 1, 2. Since M contains 3q it contains c - 3q = r, r = 0. Hence 
the ordinary integers in M are divisible by 3. 

The ideal M contains 3 and 1 + 2u and so contains 

(59) w_1 = 3, w_2 = 3u - (1 + 2u) = u - 1. 

If h is in M, then h = h_0 + h_2 u for h_0 and h_2 in E, h - h_2 w_2 = 
h_0 + h_2 is in E and in M. By the proof above h_0 + h_2 = 3h_1 for h_1 in E, 

(60) h = h_1 w_1 + h_2 w_2. 

We have thus proved that M consists of all quantities of the form (60) for 
h_1 and h_2 ordinary integers. Thus we may call w_1, w_2 a basis of M over E. 
Note that {1} has a basis 1, u over E in this sense. It may be shown that 
every ideal of J has a basis of two elements over E. 

Every integer of J has the form k = k_1 + k_2 u = k_1 + k_2 w_2 + k_2. Write 
k_1 + k_2 = 3c + r for c in E and r = 0, 1, 2. Then k = c w_1 + k_2 w_2 + r ≡ 
r (M), the elements of J - M are the residue classes 0, 1, 2 such that 3 = 0. 
We have proved that J - M is equivalent to E - {3} and is a field, M is 
a divisorless ideal of J. 
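
The description of M by the basis w_1 = 3, w_2 = u - 1 and the passage to residue classes may be made concrete. The following is a minimal sketch, assuming the elements k_1 + k_2 u of J are stored as pairs (k1, k2); the function names are illustrative.

```python
# Elements k1 + k2*u of J, with u^2 = -5, stored as pairs (k1, k2).
# The helper names below are illustrative, not taken from the text.

def mul(x, y):
    return (x[0] * y[0] - 5 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def residue(k):
    """The r = 0, 1, 2 with k congruent to r modulo M: r = (k1 + k2) mod 3."""
    return (k[0] + k[1]) % 3

def in_M(k):
    return residue(k) == 0

def coords(h):
    """For h in M return (h1, h2) with h = h1*w1 + h2*w2, w1 = 3, w2 = u - 1."""
    h0, h2 = h
    assert (h0 + h2) % 3 == 0, "h is not in M"
    return ((h0 + h2) // 3, h2)

# The generators 3 and 1 + 2u lie in M, while 1 does not.
assert in_M((3, 0)) and in_M((1, 2)) and not in_M((1, 0))

# 1 + 2u = 1*w1 + 2*w2 = 3 + 2(u - 1).
assert coords((1, 2)) == (1, 2)

# The residue map respects products, as it must for a residue class ring.
sample = [(a, b) for a in range(-4, 5) for b in range(-4, 5)]
for x in sample:
    for y in sample:
        assert residue(mul(x, y)) == (residue(x) * residue(y)) % 3
```

The map k to (k_1 + k_2) mod 3 sends u to 1, which is consistent with u^2 = -5 ≡ 1 (mod 3).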

This completes our discussion of ideals and of quadratic fields. We shall 
conclude our text with the following brief bibliographical summary. 

Let us begin with references to standard topics on matrices not covered 
in our introductory exposition. The theory of orthogonal and unitary equiv- 
alence of symmetric matrices is contained with generalizations in Chapter V 
of the Modern Higher Algebra and is further generalized and connected with 
the theory of equivalence of pairs of matrices in the author's paper entitled 
"Symmetric and Alternate Matrices in an Arbitrary Field," in Transactions 
of the American Mathematical Society, XLIII (1938), 386-436. See also pages 
74-76 and Chapter VI of L. E. Dickson's Modern Algebraic Theories, and 
Chapter VI of J. M. H. Wedderburn's Lectures on Matrices. Both of these 
texts as well as Chapters III, IV, and V of the Modern Higher Algebra 
include, of course, all the material of our Chapters II-V. For a discussion 
without details but with complete references of many other topics on 
matrices see C. C. MacDuffee, The Theory of Matrices ("Ergebnisse der 
Mathematik und ihrer Grenzgebiete," Vol. II, No. 5 [pp. 110]). 

The theory of rings as given here is contained in the detailed discussion 
of Chapters I and II of the Modern Higher Algebra, and the theory of ideals 
in Chapter XI. See also the much more extensive treatment in Van der 
Waerden's Moderne Algebra. The units of the ring of integers of a quadratic 
field are discussed on page 233 of R. Fricke's Algebra, Volume III, and its 
ideals on pages 106-10 of E. Hecke's Theorie der algebraischen Zahlen. We 
close with a reference to the only recent book in English on algebraic 
number theory, H. Weyl's Algebraic Theory of Numbers (Princeton, 1940). 

Element continued 

inverse, 110 

unity, 113 

zero, 112, 113 
Elementary: 

divisors, 94 

matrix, 92 
Elementary transformations, 24, 25 

cogredient, 52 

matrices of, 43 
Equations, systems of, 19, 79 
Equivalence, 73, 74 
Equivalence of: 

bilinear forms, 49 

forms, 17 

groups, 110 

linear spaces, 74 

matrices, 47, 89, 92 

quadratic forms, 59 
Euclid's process, 10 
Euler's theorem, 120 

Factor theorem, 5, 99 
Fermat theorem, 120 
Field, 56, 114 

characteristic of, 114 

of complex numbers, 56 

degree of, 121 

generating quantity of, 121 
Forms, 7, 13; see also Bilinear forms 
Function, 73 

characteristic, 101 

minimum, 101 
Fundamental Theorem of Arithmetic, 118 

Galois group, 122 
Gauss integers, 127 
Gauss numbers, 127 
General linear space, 74 
Greatest common divisor, 9 

of Gauss numbers, 128 

of integers, 118 

of polynomials, 9 

process, 10 
Group, 109 

abelian, 110 

additive, 111 

commutative, 110 

cyclic, 111 

order of a, 110 



order of a subgroup of, 110 
subgroup of, 110 
zero element of, 112 
Groups, equivalence of, 110 

Homogeneous polynomial, 7 

Ideal, 116 

divisorless, 117 

prime, 117 

principal, 116 

unit, 116 

zero, 116 
Identity: 

element, 110 

matrix, 30 

Independent vectors, 67 
Integers: 

algebraic, 124 

congruent, 120 

ordinary, 117 
Integral domains, 114 
Integral operations, 1 
Invariant factors, 91 
Inverse : 

element, 110 

mapping, 46 

matrix, 45 
Irreducible quantities, 115 

Leading coefficient, 3, 97 

virtual, 3, 97 
Leading form, 8 
Left division, 98 

Length preserving transformation, 
Linear: 

combination, 15 

dependence, 67 

form, 15 

independence, 67 

space, 66, 74 

subspace, 66, 71 
Linear change of variables, 38 
Linear mappings, 17, 82 

cogredient, 51 

inverse of, 17, 46 

matrix of, 36, 83 

nonsingular, 17 

product of, 36 
Linear space, 66, 71, 74 

basis of, 67 

order of, 71 
Linear transformations, 84 

orthogonal, 86 

Mappings; see Linear mappings 
Matrices: 

addition of, 59 

commutative, 37 

congruent, 51 

congruent pairs of, 108 

conjunctive, 108 

equivalent, 47, 89, 92 

equivalent pairs of, 108 

orthogonal, 86 

orthogonally equivalent, 108 

products of, 36 

rationally equivalent, 25, 34, 47 

similar, 85, 103 

unitary equivalent, 108 
Matrix : 

adjoint of, 29 

augmented, 80 

of a bilinear form, 49 

characteristic, 101 

characteristic function of a, 101 

of coefficients, 19 

columns of, 20 

determinant of, 27 

diagonal, 29 

diagonal of, 23 

diagonal elements of, 23 

elementary, 92 

of an elementary transformation, 43 

elements of, 20 

equal, 21 

Hermitian, 108 

identity, 30 

invariant factors of, 91 

inverse of, 45 

of a mapping, 36, 83 

minimum function of, 101 

nonsingular, 34 

partitioning of, 21 

principal diagonal of, 23 

rank of, 32 

rectangular, 19 



rows of, 20 

scalar, 30 

skew, 52 

skew-Hermitian, 108 

square, 20 

symmetric, 52, 53, 59, 64 

transpose of, 22 

triangular, 29 

unitary, 108 

zero, 30 

Minimum function, 101 
Minors, 27 

Monic polynomial, 3, 100 
Multiple roots, 5 

n-ary q-ic form, 13 
n-dimensional space, 66 
Nonsingular : 

linear transformation, 84 

mapping, 84 

matrix, 34 

quadratic form, 55 
Norm in a field, 123 
Norm of a vector, 86 

Order of: 

a group, 110 

a group element, 111 

a linear space, 71 
Orthogonal: 

matrices, 86 

subspaces, 87 

transformations, 86 

vectors, 87 

Point, 66 
Polynomials : 

associated, 5 

coefficients of, 6 

constant, 1 

degree of, 2, 7, 97 

divisibility of, 5 

equal, 2 

factors of, 5 

g.c.d. of, 9 

homogeneous, 7 

identically equal, 6 

monic, 3, 100 

n-ic, 13 
Polynomials continued 

products of, 3 

q-ary, 13 

rationally irreducible, 12 

relatively prime, 11 

roots of, 5 

scalar, 100 

in several variables, 6 

virtual degree of, 2, 6, 97 

zero, 1 

Positive form, 64 
Prime to a polynomial, 11 
Prime quantity, 115 
Primitive root, 121 
Principal diagonal, 23 

Quadratic field, 121 

integers of, 124 
Quadratic form, 54 

definite, 64 

index of, 62 

negative, 64 

nonsingular, 55 

real, 62 

Quadratic forms, equivalent, 59 
Quantities: 

associated, 115 

composite, 115 

congruent, 6, 11 

irreducible, 115 

prime, 115 

relatively prime, 115 
Quartic field, 121 
Quartic form, 13 
Quasi-field, 57 
Quaternary form, 13 
Quaternions, 114 
Quinary form, 13 
Quotient, 57 

right, left, 98 

Rank of: 

an adjoint matrix, 46 

a bilinear form, 49 

a matrix, 32 

a product, 44 

a quadratic form, 54 

a row space, 72 



Rational: 

equivalence, 25 

functions, 8 

operations, 8 
Real: 

quadratic form, 62 

quaternions, 114 

symmetric matrix, 64 
Reflection, 87 
Relatively prime: 

polynomials, 11 

quantities, 115 
Remainder : 

right, left, 98 

theorem, 4, 98 
Residue classes, 116 
Right division, 98 
Ring, 112 

commutative, 113 

difference, 117 

division, 113 

residue class, 117 
Roots, 5 

multiple, 5 
Rotation of axes, 87 
Row, 69 

rank, 72 

space, 69, 72 
Row by column rule, 37 

Scalar: 

matrix, 30 

polynomial, 100 

product, 16, 41 
Scalars, 19 
Self-equivalence, 107 
Semidefinite forms, matrices, 64 
Sequence: 

elements of, 16 

zero, 16 

Similar matrices, 85, 103 
Skew forms, 53 
Skew matrices, 52 
Space; see Linear space 
Spanning a space, 67 
Subfield, subgroup, 110 
Submatrix, 21 

complementary, 21 
Subring, 113 
Subspaces, 66, 75, 77 

complementary, 77 

sum of, 77 
Subsystems, 110 
Subtraction, 112 
Symmetric bilinear forms, 54 
Symmetric matrices, 52, 53 

congruence of, 59 

definiteness of, 64 

Ternary forms, 13 
Trace, 123 
Transformations, 84 
Transpose, 22 

of a product, 37 

of a sum, 62 

Unary form, 13 

Unique factorization, 115, 118, 127 



Unit ideal, 116 

Units, 115 

Unity element, 113 

Variables, change of, 38 
Vectors, 66 

linearly independent, 67 

norm of, 86 

orthogonal, 87 

zero, 66 

Virtual degree, 2, 6, 97 
Virtual leading coefficient, 3, 97 

Zero: 

divisors of, 113 
element, 112, 113 
ideal, 116 
matrix, 30 
polynomial, 1 
sequence, 16 
space, 67 



